PREFACE

The  work  reported  here  was  performed  in  the  Optical  and  IR  Science 
Laboratory  of  the  Advanced  Concepts  Division,  Environmental  Research 
Institute  of  Michigan  (ERIM).  The  work  was  sponsored  by  the  Office  of 
Naval  Research  (ONR),  Boston,  Contract  No.  N00014-86-C-0587,  funded 
from  the  Innovative  Science  and  Technology  Office  at  the  Strategic 
Defense  Initiative  Office  (SDIO/IST).  The  project  monitor  at  ONR  was 
Dr.  Fred  W.  Quelle. 

This  final  technical  report  covers  work  performed  from  1  August 
1986  to  31  December  1989.  The  principal  investigator  was  James  R. 
Fienup.  Major  contributions  to  this  work  also  included  Jack  N. 
Cederquist,  John  D.  Gorman,  and  John  H.  Seldin. 

(Volume  1  of  the  Final  Report  is  by  J.N.  Cederquist,  J.R.  Fienup 
and  J.C.  Marron,  "High  Resolution  Imaging  by  Phase  Retrieval  and 
Discrimination  Using  Speckle,"  ERIM  Report  No.  201600-11-F,  March  1989, 
which  describes  work  sponsored  by  the  Office  of  Naval  Research  and  the 
Naval  Research  Laboratory.) 

1.0 INTRODUCTION AND OVERVIEW

1.1 BACKGROUND

Discrimination  of  targets  from  decoys  can  be  done  using  imagery 
having  very  fine  resolution.  The  diffraction  limit  on  resolution, 
$\rho = \lambda R/D$, obtained from an imaging sensor at a range R using wavelength
$\lambda$ and aperture diameter D, implies that, for SDI midcourse
discrimination applications, the wavelength must be very short and/or
the aperture diameter D must be very large. Such very large apertures
would be impractically heavy and difficult to steer rapidly in space if
they were made rigid enough to be free of aberrations. On the
other hand, mirrors that are inexpensive and lightweight would warp,
causing phase errors and a severe blurring of the imagery.
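
As a rough illustration of the diffraction limit quoted above, the following sketch solves $\rho = \lambda R/D$ for the aperture diameter. The wavelength, range, and desired resolution are hypothetical values chosen only for illustration; they are not taken from this report.

```python
def required_aperture(wavelength_m, range_m, resolution_m):
    """Aperture diameter D obtained by solving rho = lambda * R / D for D."""
    return wavelength_m * range_m / resolution_m


# Hypothetical values chosen only for illustration (not from the report).
wavelength = 0.5e-6    # 0.5 micrometre (visible light)
target_range = 1.0e6   # 1000 km
resolution = 0.1       # 10 cm resolution at the target

aperture = required_aperture(wavelength, target_range, resolution)
print(f"Required aperture diameter: {aperture:.1f} m")
```

Under these assumed numbers the aperture comes out at several metres, which is the regime in which a rigid, aberration-free mirror becomes heavy.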

An  approach  to  circumventing  these  problems  is  to  employ  cheap, 
lightweight  mirrors  and  obtain  fine-resolution  images  from  them  using 
phase  retrieval  algorithms.  By  this  approach,  a  computer  algorithm 
corrects  the  errors  after  the  data  is  collected.  With  the  increasing 
speed  and  decreasing  cost  of  computers,  this  trade-off  of  simpler 
optical  hardware  at  the  expense  of  additional  computational 
requirements  is  increasingly  attractive. 

Phase  retrieval  can  be  employed  to  greatly  improve  the  quality  of 
imagery  from  a  large  number  of  sensors.  In  this  study,  we  concentrated 
on a particular imaging sensor, the Multi-Aperture Amplitude
Interferometer (MAAI), under development at the University of Maryland
(UMd) by the group headed by Doug Currie. It is essentially a
multichannel, modernized Michelson stellar interferometer that gathers the
Fourier transform of the target image, with all the spatial frequency
components  measured  simultaneously.  In  the  process  of  making  those 
measurements,  all  information  about  the  phase  of  the  complex-valued 
Fourier  transform  is  lost,  and  only  the  magnitude  of  the  Fourier 
transform (often referred to as the visibility function) is measured.
This limited information is insufficient to compute an image in a
straightforward manner. However, with the iterative phase retrieval
algorithms developed under this effort, a diffraction-limited image
can  be  reconstructed.  Aberrations  then  have  no  effect  on  the 
reconstructed  image,  and  so  fine  resolution  can  be  obtained  despite 
warping  of  the  mirror  or,  if  present,  atmospheric  turbulence. 
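
To make the idea of iterative phase retrieval concrete, the sketch below implements the generic error-reduction form of the iterative transform approach, alternating between the measured Fourier magnitude and object-domain support and nonnegativity constraints. It is a minimal illustration, not necessarily the variant used in this effort; the test object, the support mask, the iteration count, and the function name `error_reduction` are all assumptions. Plain error reduction can stagnate, so only the decrease of the Fourier-domain error is examined here.

```python
import numpy as np


def error_reduction(fourier_mag, support, n_iter=200, seed=0):
    """Error-reduction phase retrieval with support and nonnegativity constraints.

    Returns the image estimate and the Fourier-magnitude error at each iteration.
    """
    rng = np.random.default_rng(seed)
    g = rng.random(fourier_mag.shape) * support  # random nonnegative start inside the support
    errors = []
    for _ in range(n_iter):
        G = np.fft.fft2(g)
        errors.append(np.linalg.norm(np.abs(G) - fourier_mag))
        # Fourier-domain step: keep the current phase, impose the measured magnitude.
        G = fourier_mag * np.exp(1j * np.angle(G))
        g = np.fft.ifft2(G).real
        # Object-domain step: set to zero outside the support and wherever negative.
        g = np.where(support & (g > 0), g, 0.0)
    return g, errors


# Hypothetical test object: a small extended patch with one bright point.
obj = np.zeros((32, 32))
obj[12:16, 10:18] = 1.0
obj[14, 12] = 4.0

# Loose support mask, assumed known a priori.
support = np.zeros(obj.shape, dtype=bool)
support[8:20, 6:22] = True

mag = np.abs(np.fft.fft2(obj))  # only the Fourier magnitude is "measured"
estimate, errors = error_reduction(mag, support)
print(f"Fourier-magnitude error: first {errors[0]:.3f}, last {errors[-1]:.3f}")
```

The phase of the data never enters the computation, which is why pupil-plane phase errors do not affect the reconstruction in this scheme.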

This report describes an investigation of the use of phase retrieval
algorithms to reconstruct fine-resolution images from an aberrated
system (the MAAI) for the SDI midcourse discrimination scenario.
Section  1.2  gives  a  brief  overview  of  the  accomplishments  that  are 
described  in  detail  in  the  rest  of  the  report.  Section  1.3  gives 
recommendations  for  future  effort.  Section  2  describes  the  basic 
theory  behind  the  MAAI.  Section  3  shows  the  performance  of  data 
estimation  and  image  reconstruction  for  low  light  levels.  Section  4 
describes  an  analysis  of  the  imaging  performance  that  would  be  expected 
for  future  SDI  experiments.  Section  5  discusses  the  reconstruction  of 
images for the case of partially-filled apertures, as would occur if the
telescope  has  a  central  obscuration.  Section  6  describes  alternative 
geometries  within  the  MAAI  that  would  enable  it  to  measure  low  spatial 
frequencies  despite  a  central  obscuration,  which  would  be  useful  for 
ground-based  experiments.  Section  7  describes  an  alternative  new  phase 
retrieval  algorithm  based  on  a  blind  deconvolution  algorithm. 
Section  8  explores  the  probability  that  an  image  reconstructed  by  a 
phase retrieval algorithm is not unique. Section 9 shows the
computational  requirements  for  phase  retrieval  algorithms.  Section  10 
mentions  plans  towards  reconstruction  of  images  from  MAAI  data  gathered 
in  the  laboratory.  Additional  details  are  given  in  several  appendices. 
References  are  found  at  the  end  of  each  section. 

1.2 OVERVIEW OF ACCOMPLISHMENTS

In  this  section  the  principal  results  of  the  program  are  briefly 
summarized.  They  are  reported  in  detail  in  the  sections  and  appendices 
that  follow. 

The  basic  theory  of  the  MAAI  was  derived.  This  is  explained  in 
Section  2. 

A  signal  and  noise  model  for  the  MAAI  was  developed  and  analyzed. 
Several  estimators  for  the  object's  Fourier  magnitude  from  the  measured 
data  were  derived,  and  the  variance  of  the  estimate  was  calculated  as  a 
function  of  detected  photons  and  visibility  magnitude.  This  leads  to 
an  optimum  way  to  process  the  raw  data  prior  to  phase  retrieval. 
Digital  simulation  and  reconstruction  experiments  were  performed  to 
show  the  quality  of  imagery  that  would  be  reconstructed  at  different 
light  levels  and  for  different  types  of  objects.  This  is  described  in 
Section  3. 

For  parameters  of  actual  field  experiments  that  were  to  be 
performed,  the  data  was  simulated  and  images  were  reconstructed.  The 
scenario  that  was  simulated  was  the  imaging  of  the  first  Firefly 
exercise  (piggybacking  on  the  MIT  Lincoln  Laboratory  laser  radar 
experiment)  launched  from  Wallops  Island  as  would  be  viewed  by  the  MAAI 
attached  to  the  48-inch  telescope  at  Goddard  Space  Flight  Center. 
Light levels received by the MAAI, assuming sun illumination of the
target, were computed, the detected data was simulated, and images were
reconstructed.  The  results  predicted  that  the  images  produced  from  the 
MAAI  data  from  the  Goddard  48-inch  telescope  would  be  of  poor  quality. 
A  limiting  factor  was  that  the  Goddard  48-inch  telescope  has  a  large 
central  obscuration,  preventing  the  measurement  of  the  low-to-mid 
spatial  frequencies,  where  most  of  the  information  resides.  However, 
if  the  low  spatial  frequencies  were  measured,  then  it  was  shown  that 
good quality imagery could be reconstructed. This could be
accomplished  by  changes  in  the  MAAI  (which  will  be  described  later)  or 
by  using  a  telescope  which  has  a  small  central  obscuration,  such  as  the 
24-inch  at  the  Innovative  Science  and  Technology  Experimental  Facility 
(ISTEF).  Then  for  the  same  scenario,  high-quality  images  would  be 
reconstructed  with  resolution  several  times  better  than  that  ordinarily 
allowed  by  atmospheric  turbulence.  Furthermore,  if  the  same  experiment 
were  performed  in  a  space-borne  MAAI  at  the  same  range,  then  excellent 
results  would  be  obtained,  even  with  shorter  integration  times.  This 
is  described  in  Section  4. 

For the case of a partially-filled aperture, including central
obscurations  or  multiple-mirror  telescopes,  portions  of  the  spatial 
frequency  domain  are  not  measured.  Then  the  reconstruction  algorithm 
must  simultaneously  interpolate  the  phase  and  magnitude  values  where 
they  are  missing  while  retrieving  the  phase  where  the  magnitude  is 
measured.  This  is  a  particularly  difficult  task  if  the  lower  spatial 
frequencies  are  missing  because  of  a  central  obscuration  of  the 
telescope,  since  the  visibility  magnitude  at  lower  spatial  frequencies 
is  typically  much  larger  than  at  the  higher  spatial  frequencies. 
Algorithms  we  developed  to  overcome  this  problem  are  described  in 
Section  5. 

Another  way  to  get  around  the  problem  of  a  telescope  with  a  central 
obscuration  is  to  change  the  way  that  the  aperture  is  sheared  by  the 
interferometer  so  that  it  measures  the  lower  spatial  frequencies.  When 
this  is  done  the  highest  spatial  frequencies  are  lost,  but  the  net 
image  quality  can  be  far  higher  than  what  would  be  obtained  with  the 
traditional  method  of  shearing  the  wavefront.  This  is  important  for 
ground-based experiments using existing telescopes, although it would
probably  not  be  a  problem  for  an  eventual  space-based  system  for  which 
a  second  small  telescope  could  fill  the  need  for  the  low  spatial 
frequencies.  This  is  described  in  Section  6. 

An alternative to the iterative transform phase retrieval algorithm
(which  was  the  workhorse  algorithm  for  most  of  this  effort)  was 
developed.  It  is  a  version  of  the  Ayers-Dainty  blind  deconvolution 
algorithm  modified  to  solve  the  phase  retrieval  problem,  using  support 
and  nonnegativity  constraints.  This  is  described  in  Section  7. 

A  question  that  always  arises  for  image  reconstruction  by  phase 
retrieval  is  whether  the  image  obtained  is  unique.  If  it  were  likely 
that  other  images  were  also  consistent  with  the  data  and  constraints, 
then  the  method  would  not  be  reliable.  A  new  methodology  of 
quantifying  the  uniqueness  of  the  solution  was  developed  and  exercised. 
The  subspace  of  all  ambiguous  solutions  was  analytically  derived  for 
the  case  of  small  (2x3  pixels)  images.  Monte  Carlo  experiments  were 
conducted  to  determine  the  probability  that  a  random  image  would  lie 
within  a  certain  distance  of  this  subspace.  The  computation  was 
performed  for  several  different  cases.  This  is  reported  in  Section  8. 

The  computational  requirements  for  phase  retrieval  were  analyzed. 
Versions  of  the  algorithm  were  also  sent  to  other  researchers  to 
implement  on  particular  computer  architectures,  such  as  the  Carnegie- 
Mellon  Warp.  These  results  are  described  in  Section  9. 

Laboratory experiments were initiated, including preparation of
target  objects  and  porting  software  to  a  computer  at  the  University  of 
Maryland,  as  described  in  Section  10. 

Publications  arising  from  this  effort  are  given  below. 

"Image  Reconstruction  for  an  Aberrated  Amplitude  Interferometer  with  a 
Partially-Filled  Aperture,"  J.R.  Fienup  and  J.D.  Gorman,  Proceedings  of 
the  NOAO-ESO  Conference  on  High-Resolution  Imaging  by  Interferometry, 
15-18  March  1988,  Garching  bei  Munchen,  West  Germany. 

"Estimation  and  Reconstruction  from  Aberrated  Amplitude  Interferometer 
Measurements,"  J.D.  Gorman  and  J.R.  Fienup,  in  D.M.  Alloin  and  J.-M. 
Mariotti, eds., Diffraction-Limited Imaging with Very Large Telescopes
(Kluwer Academic Publishers, Boston, 1989), pp. 405-414.

"Phase-Retrieval Imaging for SDI Applications," J.R. Fienup,
Proceedings  of  the  SDIO/IST  Workshop  on  Sensor  Signal  Processing,  25-27 
April,  1989,  Leesburg,  VA. 

"Numerical  Investigation  of  Phase  Retrieval  Uniqueness,"  J.H.  Seldin 
and  J.R.  Fienup,  in  Signal  Recovery  and  Synthesis  III,  digest  of  papers 
(Optical Society of America, 1989), 14-16 June 1989, N. Falmouth, MA, pp.
120-123. 

"Numerical  Investigation  of  the  Uniqueness  of  Phase  Retrieval,"  J.H. 
Seldin  and  J.R.  Fienup,  J.  Opt.  Soc.  Am.  A  7,  pp.  412-427,  March  1990. 

"Phase  Retrieval  Using  Ayers/Dainty  Deconvolution,"  J.H.  Seldin  and 
J.R.  Fienup  in  Signal  Recovery  and  Synthesis  III,  digest  of  papers 
(O.S.A.,  1989),  14-16  June  1989,  N.  Falmouth,  MA,  pp.  124-127. 

"Iterative  Blind  Deconvolution  Algorithm  Applied  to  Phase  Retrieval," 
J.H.  Seldin  and  J.R.  Fienup,  J.  Opt.  Soc.  Am.  A  7,  pp.  428-433,  March 
1990. 

"Lower  Bounds  on  Parametric  Estimators  with  Constraints,"  J.D.  Gorman 
and  A.O.  Hero,  Fourth  Annual  ASSP  Workshop  on  Spectrum  Estimation  and 
Modeling,  August  1988. 

"Lower  Bounds  for  Parametric  Estimation  with  Constraints,"  J.D.  Gorman 
and  A.O.  Hero,  IEEE  Trans.  Inform.  Theory  36,  1285-1301  (1990). 


1.3  RECOMMENDATIONS 


Phase  retrieval  has  been  shown  via  computer  simulations  to  be  a 
means of obtaining fine-resolution images, important for discriminating
targets from decoys, from a badly-aberrated large-aperture telescope
employing an amplitude interferometer. This will enable the generation
of fine-resolution images from an imaging system that is much cheaper,
simpler,  and  lighter  in  weight  than  what  would  otherwise  be  possible 
with  competing  technologies  such  as  adaptive  optics.  It  is  recommended 
that  phase  retrieval  be  used  in  future  imaging  experiments  to 
demonstrate  its  capabilities  in  the  real  world,  that  it  be  further 
developed  to  increase  its  speed  and  reliability,  and  that  it  be 
automated.  The  analysis  of  the  uniqueness  of  the  reconstructed  image 
should  be  extended  to  include  the  case  of  larger,  more  realistic 
images. Further analysis should be performed to determine which of the
many  known  imaging  modalities  is  best  suited  to  the  SDI  midcourse 
discrimination  problem.  Phase  retrieval  can  also  be  used  to  improve 
the  images  obtained  with  other  types  of  imaging  modalities. 


2.0  AMPLITUDE  INTERFEROMETER  THEORY 


2.1  OVERVIEW  OF  THE  INTERFEROMETER 

In  this  section  we  describe  the  basic  theory  behind  the  amplitude 
interferometer  and  discuss  alternative  ways  to  arrive  at  an  estimate  of 
the  magnitude  of  the  coherence  function  from  it. 

The multi-aperture amplitude interferometer [2.1, 2.2, 2.3] is
essentially a highly parallel, multichannel Michelson stellar
interferometer [2.4] that uses a pair of measurements in an optimized
measurement scheme. It can also be viewed as a dual-channel rotational
shearing interferometer [2.5, 2.6] with a 180° angle of rotation. It is
presently under development by a group at the University of Maryland
headed by D.G. Currie. A full description of the multi-aperture
amplitude  interferometer  has  not  appeared  in  the  literature,  and  the 
description  that  follows  was  arrived  at  from  a  combination  of  the 
references  cited  above,  conversations  with  the  University  of  Maryland 
group,  and  our  own  analysis. 

From  the  data  collected  by  the  amplitude  interferometer  we  can 
compute  the  two-dimensional  modulus  (magnitude)  of  the  complex 
coherence  function  of  an  astronomical  object.  If  the  conditions  for 
the validity of the van Cittert-Zernike theorem are satisfied, then the
complex  coherence  function  is  proportional  to  the  Fourier  transform  of 
the  two-dimensional  intensity  (brightness)  distribution  of  the  object 
under  measurement.  If  both  the  modulus  and  phase  of  the  complex 
coherence  function  could  be  computed,  then  one  could  obtain  an  image  of 
the  object  by  Fourier  transformation.  However,  atmospheric  turbulence 
and/or  telescope  aberrations  severely  distort  the  phase,  allowing  the 
determination  of  only  the  modulus  of  the  complex  coherence  function, 
which  is  known  as  the  visibility  function. 


In  the  amplitude  interferometer,  the  incoming  field  is  split  into 
two halves, one of which is rotated by 180° with respect to the other.
The  two  halves  are  then  interfered  and  detected.  The  beamsplitter  in 
the  interferometer  causes  the  interference  pattern  to  appear 
simultaneously  in  two  different  planes.  Both  of  these  interference 
patterns,  which  are  similar  to  one  another  yet  different  in  a  useful 
way,  are  detected.  From  them  the  modulus  of  the  complex  coherence 
function can be computed. The amplitude interferometer has an
advantage over the rotational-shearing interferometer: the measurement
of the pair of interference patterns largely allows for the correction
of the effects of scintillation [2.1].

From  the  squared  modulus  of  the  coherence  function  we  can  compute 
the  autocorrelation  function  of  the  object.  Reconstruction  of  an  image 
of  the  object  requires  the  retrieval  of  the  phase  of  the  complex 
coherence  function,  which  can  be  accomplished  using  a  phase  retrieval 
algorithm [2.7, 2.8]. By this means an image can be obtained that has
several  times  finer  resolution  than  what  could  ordinarily  be  obtained 
through  the  turbulent  atmosphere  or  through  an  aberrated  telescope. 
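
The relation between the squared modulus and the object autocorrelation stated above can be checked numerically. The sketch below uses a small, hypothetical nonnegative object (not data from this report) and compares the inverse FFT of the squared Fourier modulus with a directly computed circular autocorrelation; the function name `circular_autocorrelation` is an illustrative helper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonnegative object, zero-padded inside a 16x16 frame.
obj = np.zeros((16, 16))
obj[5:9, 6:11] = rng.random((4, 5))

# Autocorrelation theorem: the inverse transform of |F|^2 is the autocorrelation.
fourier_mag_sq = np.abs(np.fft.fft2(obj)) ** 2
autocorr_fft = np.fft.ifft2(fourier_mag_sq).real


def circular_autocorrelation(f):
    """Direct evaluation of sum_x f(x) f(x + s) with periodic wrap-around."""
    out = np.empty_like(f)
    for sy in range(f.shape[0]):
        for sx in range(f.shape[1]):
            shifted = np.roll(f, shift=(-sy, -sx), axis=(0, 1))
            out[sy, sx] = np.sum(f * shifted)
    return out


autocorr_direct = circular_autocorrelation(obj)
print("FFT and direct autocorrelation agree:", np.allclose(autocorr_fft, autocorr_direct))
```

The autocorrelation is available without any phase information; recovering the object itself is what requires the phase retrieval step.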

2.2  THE  AMPLITUDE  INTERFEROMETER 

We  make  the  standard  assumptions  that  the  object  of  interest 
radiates  incoherently,  the  interferometer  is  in  the  far-field  of  the 
object, and the detected light is quasi-monochromatic. Under these
conditions  the  van  Cittert-Zernike  theorem,  which  states  that  the 
object  brightness  distribution  is  the  Fourier  transform  of  the  complex 
coherence  function,  is  valid  [2.9].  We  also  assume  isoplanatism:  that 
the  effects  of  the  aberrations  are  modeled  by  a  random  phase-amplitude 
screen  appearing  at  the  entrance  pupil  of  the  interferometer,  and  its 
aberrating  effects  are  space-invariant. 

The amplitude interferometer was originally designed to measure
stellar  diameters  by  making  one-dimensional  measurements  of  the  modulus 
of  the  complex  coherence  function.  This  one-dimensional  interferometer 
receives  recollimated  light  from  a  telescope,  and  consists  of  a 
Koster's  prism,  spectral  filters,  and  photomultiplier  tubes  at  each 
output  arm  of  the  prism.  This  arrangement  allowed  the  measurement  of 
the  interference  between  a  pair  of  pinholes  with  variable  separation. 
This  type  of  measurement  was  sufficient  for  stellar  diameter 
measurements. In the current amplitude interferometer, the
multi-aperture amplitude interferometer (MAAI), which is illustrated in
Figure  2-1,  the  photomultiplier  tubes  have  been  replaced  by  2-D  CCD 
arrays  and  additional  optics  have  been  incorporated  between  the 
collimator  and  the  Koster's  prism,  making  it  capable  of  making  two- 
dimensional  measurements.  These  measurements  are  made  in  a  plane  that 
is  a  demagnified  version  of  the  aperture  (pupil)  plane. 

The  key  optical  component  of  the  amplitude  interferometer  is  a 
Koster's  prism.  The  prism  acts  as  a  beamsplitter,  combining  two 
incident  optical  fields.  If  an  intensity  detector  is  placed  at  an 
output  of  the  prism,  what  is  measured  includes  a  term  proportional  to 
the  coherence  function  of  the  incident  field.  This  principle  is  used 
to  measure  the  modulus  of  the  complex  coherence  function  of  the  object. 
In our discussion we assume an ideal Koster's prism. Liewer [2.3]
discusses  the  effects  of  a  nonideal  prism. 

A  complex-valued  optical  field  U(x,y,t)  enters  the  interferometer 
from  a  telescope  and  is  split  into  half  fields.  One  half  field  passes 
through  two  mirror  reflections  and  into  one  side  of  the  Koster's  prism. 
The  other  half  passes  through  three  mirror  reflections  and  into  the 
other  side  of  the  prism  (Figure  2-1).  The  mirrors  between  the 
telescope  and  the  prism  act  to  invert  one  of  the  halves  about  the 
horizontal  axis,  making  it  U(x,-y,t),  while  the  other  half  remains 
unchanged. These two halves are combined with the beamsplitting action
of the prism.

Figure 2-1. Functional Diagram of the Multi-Aperture Amplitude
Interferometer. (In-figure label: Amplitude interferometer, Currie 1967, 1974.)

A simple ray-tracing argument can be used to show that
the transmitted beam undergoes a constant phase shift of $\delta_1$ and an
inversion about the vertical axis while the reflected beam undergoes a
constant phase shift of $\delta_2$. Assume that U(x,y,t) enters the left side
of the prism and the inverted beam U(x,-y,t) enters the right side of
the prism. Then the complex field of the beam output on the left side
of the prism, denoted as beam 1, is

$$V_1(x,y,t) = \frac{1}{\sqrt{2}} \left\{ U(x,y,t)\, e^{i\delta_2} + U(-x,-y,t)\, e^{i\delta_1} \right\} , \tag{2-1}$$

where the $1/\sqrt{2}$ factor is required for energy conservation. Similarly,
the output complex field on the right side of the prism is

$$V_2(x,y,t) = \frac{1}{\sqrt{2}} \left\{ U(-x,y,t)\, e^{i\delta_1} + U(x,-y,t)\, e^{i\delta_2} \right\} . \tag{2-2}$$

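Let $\langle \cdot \rangle_\tau$ denote a time average over the interval $[t, t+\tau]$; that is,

$$\langle f(t) \rangle_\tau = \frac{1}{\tau} \int_t^{t+\tau} f(t')\, dt' . \tag{2-3}$$

In the context of our model, $\tau$ represents the single-frame integration
time of the CCD array, which would typically be on the order of 1 msec
to 10 msec for the case of atmospheric turbulence. Then the detected
intensity of beam 1 is

$$\begin{aligned}
I_1(x,y,t) &= \langle |V_1(x,y,t)|^2 \rangle_\tau \\
&= \tfrac{1}{2} \left\{ \langle |U(x,y,t)|^2 \rangle_\tau + \langle |U(-x,-y,t)|^2 \rangle_\tau + \langle U(x,y,t)\, U^*(-x,-y,t) \rangle_\tau\, e^{i\delta} + \mathrm{c.c.} \right\} \\
&= \tfrac{1}{2} \left\{ I(x,y,t) + I(-x,-y,t) + \langle U(x,y,t)\, U^*(-x,-y,t) \rangle_\tau\, e^{i\delta} + \mathrm{c.c.} \right\} ,
\end{aligned} \tag{2-4}$$

where $\delta = \delta_2 - \delta_1$ is the difference between the reflected and
transmitted phase shifts, $I(x,y,t) = \langle |U(x,y,t)|^2 \rangle_\tau$, and c.c. denotes the
complex conjugate of the preceding term. For an ideal beamsplitter
$\delta = \pi/2$.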


The optical field in the aperture plane is assumed to be given by

$$U(x,y,t) = U_0(x,y,t) \exp[\alpha(x,y,t) + i\beta(x,y,t)] , \tag{2-5}$$

where

$$U_0(x,y,t) = U_0(x,y) \exp(i\omega t) \tag{2-6}$$

is the quasi-monochromatic optical field of wavelength $\lambda = 2\pi c/\omega$ due to
the object in the absence of atmospheric effects, $\alpha(x,y,t)$ is the
intensity-modulating effect (scintillation) of atmospheric turbulence
(the log-amplitude function) [2.9, pp. 398, 404], $\beta(x,y,t)$ is the phase
error induced by atmospheric turbulence or aberrated optics, and c is
the speed of light.

We assume that the integration time $\tau$ is many times the coherence
time of the optical field, which is approximately the reciprocal of the
bandwidth, $\Delta\nu = \Delta\omega/2\pi$, of the radiation. Thus the mutual intensity of
the incident optical field due to the object is given by

$$\begin{aligned}
\Gamma(\Delta x,\Delta y) &= |\Gamma(\Delta x,\Delta y)| \exp[i\phi(\Delta x,\Delta y)] \\
&= \langle U_0(x,y,t)\, U_0^*(x - \Delta x,\, y - \Delta y,\, t) \rangle_\tau \\
&= I_0\, \gamma(\Delta x,\Delta y) ,
\end{aligned} \tag{2-7}$$

where $I_0 = \Gamma(0,0) = \langle |U_0(x,y,t)|^2 \rangle_\tau$ is the average intensity and the
normalized quantity $\gamma(\Delta x,\Delta y)$ is the complex coherence function.
($\gamma(\Delta x,\Delta y)$ is usually denoted by $\mu_{12}$ [2.9, p. 183]; however, we
use the symbol $\gamma$ to be consistent with the notation of earlier
publications on the amplitude interferometer.)

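Inserting Eqs. (2-5) to (2-7) into Eq. (2-4), and assuming that
$\alpha(x,y,t)$ and $\beta(x,y,t)$ are constant over the time interval $\tau$, yields

$$\begin{aligned}
I_1(x,y,t) = \frac{I_0}{2} \big\{ & \exp[2\alpha(x,y,t)] + \exp[2\alpha(-x,-y,t)] \\
& + \gamma(2x,2y) \exp[\alpha(x,y,t) + \alpha(-x,-y,t) \\
& \qquad + i\beta(x,y,t) - i\beta(-x,-y,t)] \exp(i\delta) + \mathrm{c.c.} \big\} .
\end{aligned} \tag{2-8}$$

For an ideal beamsplitter, with $\delta = \pi/2$, this becomes

$$\begin{aligned}
I_1(x,y,t) = \frac{I_0}{2} \big\{ & \exp[2\alpha(x,y,t)] + \exp[2\alpha(-x,-y,t)] \\
& - 2 \exp[\alpha(x,y,t) + \alpha(-x,-y,t)]\, |\gamma(2x,2y)| \\
& \qquad \times \sin[\phi(2x,2y) + \beta(x,y,t) - \beta(-x,-y,t)] \big\} .
\end{aligned} \tag{2-9}$$

$|\gamma|$ is the visibility (contrast) of the sinusoidal fringe that was seen
by Michelson. Similarly,

$$\begin{aligned}
I_2(-x,y,t) = \frac{I_0}{2} \big\{ & \exp[2\alpha(x,y,t)] + \exp[2\alpha(-x,-y,t)] \\
& + 2 \exp[\alpha(x,y,t) + \alpha(-x,-y,t)]\, |\gamma(2x,2y)| \\
& \qquad \times \sin[\phi(2x,2y) + \beta(x,y,t) - \beta(-x,-y,t)] \big\} .
\end{aligned} \tag{2-10}$$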

A function related to the fringe visibility function is given by

$$\frac{I_2 - I_1}{I_2 + I_1} \equiv \frac{I_2(-x,y,t) - I_1(x,y,t)}{I_2(-x,y,t) + I_1(x,y,t)} = \frac{|\gamma(2x,2y)| \sin[\phi(2x,2y) + \beta(x,y,t) - \beta(-x,-y,t)]}{\cosh[\alpha(x,y,t) - \alpha(-x,-y,t)]} . \tag{2-11}$$

One of the major advantages of the amplitude interferometer over
other rotational shearing interferometers is the suppression of the
effects of the scintillation, $\alpha(x,y,t)$, by the cosh[ ] function in Eq.
(2-11).
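
The normalized difference can be verified numerically at a single pupil point. The sketch below builds the two detected intensities in the form of Eqs. (2-9) and (2-10) and checks that $(I_2 - I_1)/(I_2 + I_1)$ equals $|\gamma| \sin[\phi + \beta(x,y) - \beta(-x,-y)] / \cosh[\alpha(x,y) - \alpha(-x,-y)]$. All numerical values and the helper name `detected_intensities` are hypothetical, chosen only for illustration.

```python
import numpy as np


def detected_intensities(i0, gamma_mag, phi, alpha_a, alpha_b, beta_a, beta_b):
    """Intensities of the two output beams, following Eqs. (2-9) and (2-10).

    alpha_a, beta_a are evaluated at (x, y); alpha_b, beta_b at (-x, -y).
    """
    mean_term = np.exp(2 * alpha_a) + np.exp(2 * alpha_b)
    fringe = 2 * np.exp(alpha_a + alpha_b) * gamma_mag * np.sin(phi + beta_a - beta_b)
    i1 = 0.5 * i0 * (mean_term - fringe)
    i2 = 0.5 * i0 * (mean_term + fringe)
    return i1, i2


# Hypothetical single-frame values.
i0, gamma_mag, phi = 100.0, 0.6, 0.4
alpha_a, alpha_b = 0.3, -0.2   # log-amplitude (scintillation) at the two points
beta_a, beta_b = 1.1, -0.7     # phase errors at the two points

i1, i2 = detected_intensities(i0, gamma_mag, phi, alpha_a, alpha_b, beta_a, beta_b)
ratio = (i2 - i1) / (i2 + i1)
expected = gamma_mag * np.sin(phi + beta_a - beta_b) / np.cosh(alpha_a - alpha_b)
print("ratio:", ratio, "closed form:", expected)
```

Scintillation enters only through the cosh of the log-amplitude difference, which stays close to unity when that difference is small, and the mean intensity cancels entirely.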


In the absence of phase errors, $(I_2 - I_1)/(I_2 + I_1)$ of Eq. (2-11)
yields $|\gamma(2x,2y)| \sin[\phi(2x,2y)]$, which is the imaginary part of
$\gamma(2x,2y)$. Under this condition, if the object were to be positioned to
one side of the optical axis, then it could easily be reconstructed by
Fourier transforming the imaginary part of $\gamma(2x,2y)$ and discarding one
of the resulting twin images. However, the phase errors $\beta(x,y,t)$
prevent us from doing this when imaging through the aberrations.
Averaging over a time long compared with the fluctuation time of
$\beta(x,y,t)$ just causes $(I_2 - I_1)/(I_2 + I_1)$ to average out to zero.

Suppose we gather M short exposures (frames), each of duration $\tau$,
separated by time $\Delta t$. Further suppose that the total collection time,
$T = M \Delta t$, is many times the correlation time of the phase error. Then
one way to extract the desired quantity, $|\gamma(2x,2y)|$, from Eq. (2-11) is as
follows. Ignoring $\alpha(x,y,t)$, we can square Eq. (2-11) and obtain

$$\left[ \frac{I_2 - I_1}{I_2 + I_1} \right]^2 = |\gamma(2x,2y)|^2 \sin^2[\phi(2x,2y) + \beta(x,y,t) - \beta(-x,-y,t)] . \tag{2-12}$$

Averaging this quantity over the M frames gives

$$\begin{aligned}
\left\langle \left[ \frac{I_2 - I_1}{I_2 + I_1} \right]^2 \right\rangle_T
&= |\gamma(2x,2y)|^2 \left\langle \sin^2[\phi(2x,2y) + \beta(x,y,t) - \beta(-x,-y,t)] \right\rangle_T \\
&\simeq |\gamma(2x,2y)|^2\, M^{-1} \sum_{m=1}^{M} \sin^2[\phi(2x,2y) + \beta(x,y,m\Delta t) - \beta(-x,-y,m\Delta t)] \\
&= |\gamma(2x,2y)|^2 / 2 ,
\end{aligned} \tag{2-13}$$

where it is assumed that the phase error $\beta$ varies with time and is
uniformly distributed over $(-\pi, \pi)$ over the time interval T. Therefore
a reasonable estimator for $|\gamma(2x,2y)|$ is

$$|\gamma(2x,2y)|^2 = 2 \left\langle \left[ \frac{I_2 - I_1}{I_2 + I_1} \right]^2 \right\rangle_T . \tag{2-14}$$

Currie [2.1, 2.2] proposed using the quantity

$$|\gamma(2x,2y)|^2 = 2\, \frac{AC - CC}{AC + CC} , \tag{2-15}$$

where

$$AC = \langle I_1^2 + I_2^2 \rangle_T \tag{2-16}$$

and

$$CC = 2 \langle I_1 I_2 \rangle_T . \tag{2-17}$$

Inserting Eqs. (2-16) and (2-17) into Eq. (2-15) reveals that this
yields

$$|\gamma(2x,2y)|^2 = 2\, \frac{\langle (I_2 - I_1)^2 \rangle_T}{\langle (I_2 + I_1)^2 \rangle_T} , \tag{2-18}$$

which is similar to the estimator given in Eq. (2-14) but changes the
order of the time averaging operation and the division operation.
However, as will be seen later, the performance of the estimator in Eq.
(2-14) can be shown to be significantly better for the case of low
light levels.
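
The two orderings (divide then average, as in Eq. (2-14), and average then divide, as in Eq. (2-18)) can be compared in a small Monte Carlo sketch. The sketch assumes no scintillation and no photon noise, and models the frame-to-frame phase-error difference $\beta(x,y,t) - \beta(-x,-y,t)$ directly as uniform over $(-\pi, \pi)$; these modeling choices and all numerical values are assumptions made for illustration. Because noise is not modeled, the low-light performance difference mentioned above does not appear here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical visibility magnitude, coherence phase, and mean intensity.
gamma_mag, phi, i0 = 0.6, 0.4, 1.0
n_frames = 20000

# Phase-error difference, drawn independently for each short-exposure frame.
dbeta = rng.uniform(-np.pi, np.pi, n_frames)
fringe = gamma_mag * np.sin(phi + dbeta)
i1 = i0 * (1.0 - fringe)   # form of Eq. (2-9) with alpha = 0
i2 = i0 * (1.0 + fringe)   # form of Eq. (2-10) with alpha = 0

ratio = (i2 - i1) / (i2 + i1)
long_average = ratio.mean()                       # tends toward zero
divide_then_average = 2.0 * np.mean(ratio ** 2)   # ordering of Eq. (2-14)
average_then_divide = (
    2.0 * np.mean((i2 - i1) ** 2) / np.mean((i2 + i1) ** 2)
)                                                 # ordering of Eq. (2-18)

print("true |gamma|^2      :", gamma_mag ** 2)
print("mean of ratio       :", long_average)
print("divide then average :", divide_then_average)
print("average then divide :", average_then_divide)
```

In this idealized case both estimates approach $|\gamma|^2$, while the plain average of the ratio carries no information about the visibility.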

Alternatively, if the phase error $\beta(x,y,t)$ is constant during the
total integration time T, then the fluctuations in $\beta$ cannot be employed
to cause the average of the $\sin^2[\ ]$ term to be 1/2. Then we can
achieve the same effect by introducing a phase plate, with spatially
uniform phase $\theta(t)$, which can change with time, in front of one half of
the Koster's prism. Then Eq. (2-12) is replaced by

$$\left[ \frac{I_2 - I_1}{I_2 + I_1} \right]^2 = |\gamma(2x,2y)|^2 \sin^2[\phi(2x,2y) + \beta(x,y,t) - \beta(-x,-y,t) - \theta(t)] . \tag{2-19}$$

One choice of $\theta(t)$ would be 0 for half the time and $\pi/2$ for the other
half of the time. Since $\sin^2(\theta_0 + 0) + \sin^2(\theta_0 - \pi/2) = \sin^2(\theta_0) +
\cos^2(\theta_0) = 1$ for any angle $\theta_0$, then

$$\left\langle \left[ \frac{I_2 - I_1}{I_2 + I_1} \right]^2 \right\rangle_T = \frac{1}{2}\, |\gamma(2x,2y)|^2 . \tag{2-20}$$

This scheme has the great advantage that only two frames of data need be
taken to estimate $|\gamma|^2$, and this maximizes the signal-to-noise ratio
for a given total number of photons, as will be seen later. Another
possible choice for $\theta(t)$ is the discrete values $\{0, \pi/2, \pi, 3\pi/2\}$.
Another is to vary $\theta(t)$ continuously between 0 and $2\pi$ radians, while
integrating over an integer number of frames during each 0 to $2\pi$ cycle.
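
The phase-plate scheme can be illustrated for a constant, unknown phase error. In the sketch below the squared normalized difference follows the form of Eq. (2-19) with scintillation and noise ignored; the numerical values and the helper name `squared_ratio` are hypothetical. Two frames taken with the plate at 0 and $\pi/2$ are already enough to recover $|\gamma|^2$, and the four-step choice gives the same result.

```python
import numpy as np

# Hypothetical visibility magnitude and coherence phase.
gamma_mag, phi = 0.6, 0.4
# Constant (unknown to the observer) phase-error difference beta(x,y) - beta(-x,-y).
dbeta = 1.8


def squared_ratio(theta):
    """Squared normalized difference for phase-plate setting theta (alpha = 0)."""
    return (gamma_mag * np.sin(phi + dbeta - theta)) ** 2


two_step = (0.0, np.pi / 2)
four_step = (0.0, np.pi / 2, np.pi, 3 * np.pi / 2)

two_frame = 2.0 * np.mean([squared_ratio(t) for t in two_step])
four_frame = 2.0 * np.mean([squared_ratio(t) for t in four_step])

print("true |gamma|^2 :", gamma_mag ** 2)
print("two-frame      :", two_frame)
print("four-frame     :", four_frame)
```

The result does not depend on the value assumed for the constant phase error, since the two plate settings always contribute a sine-squared and a cosine-squared term of the same angle.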

Additional estimators of $|\gamma|^2$ can be obtained by averaging then
dividing, i.e. $\langle (I_2 - I_1)^2 \rangle_T / \langle (I_2 + I_1)^2 \rangle_T$, rather than dividing then
averaging, i.e. $\langle [(I_2 - I_1)/(I_2 + I_1)]^2 \rangle_T$, as was assumed above.

The section that follows treats the case of measurements limited by
photon noise, in which case different estimators of $|\gamma|^2$ can have
significantly different variances.

3.0 PERFORMANCE AT LOW LIGHT LEVELS

In  this  section  we  examine  the  performance  of  the  amplitude 
interferometer  at  low  light  levels,  both  analytically  and  through 
computer  simulation.  Continuing  the  development  in  Section  2,  we 
provide  a  statistical  model  of  the  amplitude  interferometer  and  discuss 
a  method  for  obtaining  diffraction-limited  imagery  from  aberrated,  low 
light-level  measurements  of  the  mutual  coherence  function.  Our  basic 
approach  is  to  perform  a  sequence  of  measurements  from  which  samples  of 
the  modulus  of  the  mutual  coherence  can  be  estimated  and  then  to 
perform  phase  retrieval  to  recover  the  complex  mutual  coherence 
function.  The  recovered  samples  of  the  coherence  function  are  then 
Fourier  transformed  to  yield  an  image  of  the  object  intensity. 

The  organization  of  Section  3  is  as  follows.  In  Section  3.1,  we 
present  a  statistical  model  for  the  amplitude  interferometer  and 
discuss  three  methods  for  estimating  the  modulus  of  the  mutual 
coherence  from  low  light  level  amplitude  interferometer  measurements  in 
the presence of aberrations. The first two methods [3.3], which are
suitable for applications in which the aberration is slowly varying,
require a modification of the amplitude interferometer as shown in
Figure 3-1. The third method, proposed by Currie [3.4, 3.5], can be
used in situations where the aberrations are rapidly varying, such as
aberrations caused by atmospheric turbulence. In Section 3.2, we
develop  a  lower  bound  on  the  mean-squared  error  in  estimating  the 
object  intensity  from  amplitude  interferometer  measurements,  using  the 
statistical  model  of  Section  3.1.  Finally,  Section  3.3  contains 
results  from  several  digital  simulations  and  image-reconstruction 
experiments.  As  one  might  expect,  the  quality  of  the  reconstruction 
depends  not  only  on  the  light  level,  but  also  on  the  content  of  the 
image.  The  more  specular  or  point-like  the  object  is,  the  better  the 
reconstruction;  diffuse  objects  are  the  most  difficult  to  reconstruct. 
These  observations  are  confirmed  by  the  digital  simulations  in  Section 
3.3. 
Figure 3-1. Schematic Diagram of a Modified Amplitude Interferometer.
A variable phase plate has been added to allow the
introduction of phase term $\theta(t)$ into the measurements.
3.1 MEASUREMENT MODEL

The amplitude interferometer measurements are assumed to consist of
a sequence of pairs of two-dimensional video frames which are the
outputs of the CCD arrays at each of the two output arms of the
interferometer. We denote these measurements as
$(N^1_{ijk}, N^2_{ijk})$, where $N^1_{ijk}$ and $N^2_{ijk}$
respectively are the detected output energy at the (i,j)th detector
element and the kth frame of the left and right output arms of the
interferometer. At low light levels and with ideal detectors,
$N^1_{ijk}$ and $N^2_{ijk}$ consist of the number of photon events
detected over each detector element (i,j) and over each time frame k.
The counts are well modeled as Poisson-distributed random variables
[3.13,3.15] with mean values

$$\Lambda^1_{ijk} = \int_{t_{k-1}}^{t_k} \left[ I^1_{ij}(t) + I_B \right] dt , \qquad
\Lambda^2_{ijk} = \int_{t_{k-1}}^{t_k} \left[ I^2_{ij}(t) + I_B \right] dt \tag{3-1}$$

respectively, where $I_B$ models contributions due to background light
and the dark current of the CCD arrays and $[t_{k-1}, t_k]$ denotes the
detector integration interval for the kth frame. $I^1_{ij}(t)$ and
$I^2_{ij}(t)$ denote the respective instantaneous intensities at the
output of the two interferometer arms.

Expressions for $I^1_{ij}(t)$ and $I^2_{ij}(t)$ were previously derived
in Section 2. Here we use the subscript notation to emphasize the fact
that the output intensities $I_1(x,y,t)$ and $I_2(x,y,t)$ of Eqs. (2-9)
and (2-10) are sampled:

$$I^1_{ij}(t) = \iint_{\Delta x\,\Delta y} I_1(x - x_i,\; y - y_j,\; t)\, dx\, dy , \tag{3-2}$$
where $(x_i, y_j)$ denotes the center of the (i,j)th detector element
and $\Delta x\,\Delta y$ is its area. A similar relationship holds for
$I^2_{ij}(t)$. In our notation for the discretized mutual coherence
function, we suppress the fact that $\gamma$ is sampled at half the
rate that the output intensities $I^1_{ij}$ and $I^2_{ij}$ are:

$$\gamma_{ij} = \iint_{\Delta x\,\Delta y} \gamma(2x - 2x_i,\; 2y - 2y_j)\, dx\, dy . \tag{3-3}$$

This reduction in sampling rate results from the fact that incident
field components at $(x_i, y_j)$ and $(-x_i, -y_j)$ are interfered to
obtain the mutual coherence component at $(2x_i, 2y_j)$. This
difference in sampling rates is not important for the discussion in
this section; however, it plays an important role in the determination
of the appropriate sampling rates in the Firefly simulation discussed
in Section 4.3. For simplicity, we assume the integration interval
$\Delta t = t_k - t_{k-1}$ is the same for each frame. We also assume
that $I_B$ is explicitly known and, for simplicity, that it is constant
in time and over the entire aperture plane. The intensity parameters
$\Lambda^1_{ijk}$ and $\Lambda^2_{ijk}$ are possibly random variables
due to the stochastic nature of the phase term $\phi_{ij}(t)$.
Therefore, processes such as $N^1_{ijk}$ and $N^2_{ijk}$ are typically
called doubly-stochastic Poisson processes [3.15]. By this we mean
that, conditioned on $\Lambda^1_{ijk}$, $N^1_{ijk}$ is a discrete random
variable with the probability mass function

$$P\left(N^1_{ijk} = n \mid \Lambda^1_{ijk}\right) = \frac{(\Lambda^1_{ijk})^{n}\, e^{-\Lambda^1_{ijk}}}{n!} , \qquad n = 0, 1, 2, \ldots \tag{3-4}$$

A similar relationship holds for $N^2_{ijk}$.

Assuming  an  ideal  beamsplitter  (ff  =  r/2  in  Eq.  2-4)  and  ignoring 

the effects of scintillation, the instantaneous output intensities,
$I^1_{ij}(t)$ and $I^2_{ij}(t)$, can be reexpressed as:

$$I^1_{ij}(t) = I_0 \{ 1 - |\gamma_{ij}| \sin[\arg\gamma_{ij} + \phi_{ij}(t)] \}$$
$$I^2_{ij}(t) = I_0 \{ 1 + |\gamma_{ij}| \sin[\arg\gamma_{ij} + \phi_{ij}(t)] \} , \tag{3-5}$$

where superscripts 1 and 2 denote the left and right output arms of the
interferometer, $\gamma_{ij}$ denotes the (discretized) normalized
complex mutual coherence function of the incident field, $I_0$ denotes
the average instantaneous detected energy in photons per second, and
$\phi_{ij}(t)$ denotes the phase difference between the input arms of
the interferometer and can include both random and non-random
contributions from fixed or varying system aberrations and atmospheric
turbulence.
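As an illustration of the measurement model, the following is a minimal sketch of the mean counts implied by Eqs. (3-1) and (3-5), assuming the phase is constant over the frame so that the time integral reduces to a product with the frame length. The function name `mean_counts` and all numerical values are hypothetical.

```python
import math

def mean_counts(gamma_mod, gamma_arg, phi, i0, i_b, dt):
    """Return the mean counts (arm 1, arm 2) for one pixel and one frame.

    Assumes the phase phi is constant over the frame, so the time integral
    of Eq. (3-1) reduces to (intensity + background) * dt.
    """
    g = gamma_mod * math.sin(gamma_arg + phi)
    lam1 = (i0 * (1.0 - g) + i_b) * dt  # left arm, Eq. (3-5) with minus sign
    lam2 = (i0 * (1.0 + g) + i_b) * dt  # right arm, Eq. (3-5) with plus sign
    return lam1, lam2

# Hypothetical values: 1000 photons/s, 20% background, half-second frame.
lam1, lam2 = mean_counts(0.2, 0.3, 0.0, i0=1000.0, i_b=200.0, dt=0.5)
total = lam1 + lam2  # fringe terms cancel: 2 * (i0 + i_b) * dt
```

The sum of the two arm means does not depend on the coherence or the phase, which is the property used when estimating the average light level from the data.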

We assume that $I_0$ is known or can be accurately determined from the
measurements. This is not an unrealistic assumption since, by Eq.
(3-1), $I_0$ can be estimated by forming the sum,

$$\hat{I}_0 = \frac{1}{2 N^2 K \Delta t} \sum_{ijk} \left( N^1_{ijk} + N^2_{ijk} \right) - I_B . \tag{3-6}$$

Here, $N^2$ denotes the total number of pixels in each of K pairs of
frames in the data collection. $\hat{I}_0$ is based upon $N^2 K$
independent measurements and its variance decreases as the number of
frames or pixels increases.
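The estimate of Eq. (3-6) can be sketched as follows. This is a noise-free check under stated assumptions: the mean counts of Eq. (3-1) are fed in place of measured data, the phase is taken as constant within each frame, and the pixel values, phases, and the name `estimate_i0` are hypothetical.

```python
import math

def estimate_i0(counts1, counts2, n_pixels, n_frames, dt, i_b):
    """Eq. (3-6): average all counts from both arms, then remove background."""
    total = sum(counts1) + sum(counts2)
    return total / (2.0 * n_pixels * n_frames * dt) - i_b

# Noise-free check: feed the mean counts of Eq. (3-1) in place of data.
i0, i_b, dt = 1000.0, 200.0, 0.5
pixels = [(0.2, 0.3), (0.5, -1.0), (0.8, 2.0)]   # hypothetical (|gamma|, arg gamma)
phases = [0.0, math.pi / 2]                      # one phase value per frame
counts1, counts2 = [], []
for mod, arg in pixels:
    for phi in phases:
        g = mod * math.sin(arg + phi)
        counts1.append((i0 * (1.0 - g) + i_b) * dt)
        counts2.append((i0 * (1.0 + g) + i_b) * dt)
i0_hat = estimate_i0(counts1, counts2, len(pixels), len(phases), dt, i_b)
```

Because the fringe terms of the two arms cancel pixel by pixel, the estimate returns the assumed light level regardless of the coherence values.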

Our approach to image reconstruction from amplitude interferometer
measurements will be to form an estimate of the modulus of the mutual
coherence function $|\gamma_{ij}|$ and perform phase retrieval to
recover the phase of the coherence function,
$\gamma_{ij} = |\gamma_{ij}| \exp\{ i \arg\gamma_{ij} \}$, from its
modulus. A reconstructed image is then formed by inverse Fourier
transformation of the coherence function. An estimator for
$|\gamma_{ij}|$ can be determined given the model described above. A
reasonable estimate is to choose the values $|\hat{\gamma}_{ij}|$ which
are most likely to have resulted in the measurements
$(N^1_{ijk}, N^2_{ijk})$. This estimate is obtained by maximizing the
logarithm of the probability of the measurements,
$(N^1_{ijk}, N^2_{ijk})$, with respect to $|\gamma_{ij}|$. This
approach, called maximum-likelihood estimation, has several desirable
features which are mentioned in [3.15]. For Poisson-distributed random
variables, the logarithm of the probability distribution, denoted
$l(\gamma)$, is

$$l(\gamma) = -\sum_{ijk} \left( \Lambda^1_{ijk} + \Lambda^2_{ijk} \right)
+ \sum_{ijk} N^1_{ijk} \log\left(\Lambda^1_{ijk}\right)
+ \sum_{ijk} N^2_{ijk} \log\left(\Lambda^2_{ijk}\right) + C , \tag{3-7}$$

where C is a constant which is independent of $|\gamma_{ij}|^2$. The
maximum-likelihood estimator for $|\gamma_{ij}|$, if it exists, is then
a solution of the equation

$$\frac{\partial l}{\partial |\gamma_{ij}|}
= \sum_k \frac{N^1_{ijk}}{\Lambda^1_{ijk}} \frac{\partial \Lambda^1_{ijk}}{\partial |\gamma_{ij}|}
+ \sum_k \frac{N^2_{ijk}}{\Lambda^2_{ijk}} \frac{\partial \Lambda^2_{ijk}}{\partial |\gamma_{ij}|} = 0 . \tag{3-8}$$

Equation (3-8) is nonlinear in $|\gamma_{ij}|$ and is generally
difficult to solve. Moreover, no information has been specified about
$\phi_{ij}(t)$. In the subsequent discussion, we examine three
estimators for $|\gamma_{ij}|$ for the cases where:

1. $\phi_{ij}(t)$ is constant over each of K intervals,

2. $\phi_{ij}(t)$ varies linearly over the collection period, and

3. $\phi_{ij}(t)$ contains a phase term due to atmospheric turbulence and
changes rapidly over the collection period.
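To make the maximum-likelihood idea concrete, the sketch below evaluates the Poisson log-likelihood for a single pixel and maximizes it over the modulus by a simple grid search. It assumes, for illustration only, that the phase in each frame and the argument of the coherence are known, and it uses noise-free counts equal to their means; the function name and all values are hypothetical.

```python
import math

def log_likelihood(gamma_mod, gamma_arg, phis, n1, n2, i0, i_b, dt):
    """Poisson log-likelihood for one pixel, up to an additive constant."""
    total = 0.0
    for phi, a, b in zip(phis, n1, n2):
        g = gamma_mod * math.sin(gamma_arg + phi)
        lam1 = (i0 * (1.0 - g) + i_b) * dt
        lam2 = (i0 * (1.0 + g) + i_b) * dt
        total += -(lam1 + lam2) + a * math.log(lam1) + b * math.log(lam2)
    return total

# Hypothetical setup: known phases, noise-free "counts" equal to their means.
i0, i_b, dt, arg = 500.0, 50.0, 1.0, 0.7
phis = [0.0, math.pi / 2]
true_mod = 0.4
n1 = [(i0 * (1 - true_mod * math.sin(arg + p)) + i_b) * dt for p in phis]
n2 = [(i0 * (1 + true_mod * math.sin(arg + p)) + i_b) * dt for p in phis]

grid = [m / 100 for m in range(100)]  # candidate moduli 0.00 .. 0.99
best = max(grid, key=lambda m: log_likelihood(m, arg, phis, n1, n2, i0, i_b, dt))
```

With noise-free inputs the search returns the true modulus; with Poisson data and unknown phases the maximization is much harder, which motivates the simpler estimators considered for the three cases.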


In the first two cases, we assume that the phase term $\phi_{ij}(t)$ is
given by

$$\phi_{ij}(t) = \Delta\beta_{ij}(t) + \theta(t) , \tag{3-9}$$

where $\Delta\beta_{ij}(t) = \beta(x,y,t) - \beta(-x,-y,t)$ from Eq.
(2-9), and $\theta(t)$ is a user-controlled phase term introduced into
the amplitude interferometer. One method of incorporating such a phase
term is to place a variable-phase plate over one of the input arms of
the interferometer as shown in Figure 3-1. In the third case, which is
discussed briefly at the end of Section 2, we assume that
$\phi_{ij}(t)$ is given by

$$\phi_{ij}(t) = \Delta\beta_{ij}(t) , \tag{3-10}$$

where $\Delta\beta_{ij}(t)$ is the phase difference introduced by
atmospheric turbulence as described in Eq. (2-9). In the discussion to
follow we assume that the phase term $\phi_{ij}(t)$ is constant during
any integration interval $[t_{k-1}, t_k]$ and denote it as
$\phi_{ijk}$.

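3.1.1 Discrete Stepped-Phase Systems

Consider a stepped-phase system in which $\theta(t)$ in Eq. (3-9) is
constant over each of K intervals of length $\Delta t = T/K$, where T
is the total collection period, and denote its value by $\theta_k$,
k = 1, ..., K. Here we assume that $\Delta\beta_{ij}(t)$ is constant
over T: $\Delta\beta_{ij}(t) = \Delta\beta_{ij}$ for
$t \in [t_{k-1}, t_k]$, k = 1, ..., K. Define

$$g_{ijk} = |\gamma_{ij}| \sin[\arg\gamma_{ij} + \Delta\beta_{ij} + \theta_k] . \tag{3-11}$$

Then $\Lambda^1_{ijk}$ and $\Lambda^2_{ijk}$ become, using (3-1),
(3-5), and (3-9),

$$\Lambda^1_{ijk} = \frac{I_0 T}{K} \left[ c - g_{ijk} \right] , \qquad
\Lambda^2_{ijk} = \frac{I_0 T}{K} \left[ c + g_{ijk} \right] , \tag{3-12}$$

where

$$c = 1 + I_B / I_0 . \tag{3-13}$$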
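If $\theta_k$ is chosen to satisfy

$$\frac{1}{K} \sum_{k=1}^{K} \sin[\arg\gamma_{ij} + \Delta\beta_{ij} + \theta_k] = 0 \quad \text{for } K > 2,$$
$$\frac{1}{K} \sum_{k=1}^{K} \sin^2[\arg\gamma_{ij} + \Delta\beta_{ij} + \theta_k] = \frac{1}{2} , \tag{3-14}$$

we have that

$$|\gamma_{ij}|^2 = \frac{2}{K} \sum_{k=1}^{K} g^2_{ijk} . \tag{3-15}$$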
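A short numerical check of this property, as a sketch only: for K = 2 phase steps of 0 and pi/2, the sum of the squared sine terms is independent of the unknown phase offset, so twice the frame average of the squared quantities returns the squared modulus for any aberration. The aberration values below are hypothetical.

```python
import math

def squared_modulus(g_values):
    """Squared modulus as (2/K) times the sum of the squared g_k."""
    return 2.0 * sum(g * g for g in g_values) / len(g_values)

gamma_mod, gamma_arg = 0.35, 1.1
thetas = [0.0, math.pi / 2]            # K = 2 stepped phases
results = []
for dbeta in (-0.7, 0.0, 2.5):         # hypothetical unknown aberrations
    g = [gamma_mod * math.sin(gamma_arg + dbeta + th) for th in thetas]
    results.append(squared_modulus(g))
```

Each entry of `results` equals 0.35 squared, whatever aberration is assumed.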


Thus the motivation for introducing the controllable phase term into
the interferometer is that, for a suitable sequence $\theta_k$,
k = 1, ..., K, one can determine $|\gamma_{ij}|^2$ from
$\Lambda^1_{ijk}$ and $\Lambda^2_{ijk}$ regardless of the aberration
$\Delta\beta_{ij}$.

One could consider the two-step process of first computing the
maximum-likelihood estimate of $g_{ijk}$ and then estimating
$|\gamma_{ij}|^2$ from $\hat{g}^2_{ijk}$, using the above equation.
Maximization of Eq. (3-7) with respect to $g_{ijk}$ is much simpler and
the maximum-likelihood estimate of $g^2_{ijk}$ is given by

$$\hat{g}^2_{ijk} = c^2 \left[ \frac{N^2_{ijk} - N^1_{ijk}}{N^2_{ijk} + N^1_{ijk}} \right]^2 , \tag{3-16}$$

where c is given by Eq. (3-13). The resulting estimator for the
squared modulus is then

$$|\hat{\gamma}_{ij}|^2 = \frac{2 c^2}{K} \sum_{k=1}^{K} \left[ \frac{N^2_{ijk} - N^1_{ijk}}{N^2_{ijk} + N^1_{ijk}} \right]^2 . \tag{3-17}$$


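We refer to this as "discrete estimator 1" (D1). For each pixel (i,j)
and each frame k, the quantity $N^2_{ijk} - N^1_{ijk}$ is normalized by
the total number of counts detected within the pixel and frame,
$N^2_{ijk} + N^1_{ijk}$. One might also consider performing the
normalization operation after frame averaging; this results in two
other estimators which we refer to as "discrete estimator 2" (D2):

$$|\hat{\gamma}_{ij}|^2 = \frac{1}{2K (\hat{I}_0 T / K)^2} \sum_{k} \left[ N^2_{ijk} - N^1_{ijk} \right]^2 \tag{3-18}$$

and "discrete estimator 3" (D3):

$$|\hat{\gamma}_{ij}|^2 = \frac{2 c^2 K \sum_{k} \left( N^2_{ijk} - N^1_{ijk} \right)^2}{\left[ \sum_{k} \left( N^2_{ijk} + N^1_{ijk} \right) \right]^2} . \tag{3-19}$$
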


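At extremely low light levels there is a bias term proportional to
$1/I_0$ which is present in all three estimators. To account for this
bias, correction terms can be incorporated into the estimators.
Bias-corrected (BC) versions of these estimators are given by

D1-BC:

$$|\hat{\gamma}_{ij}|^2 = \frac{2 c^2}{K} \sum_{k} \left\{
\frac{\left[ N^2_{ijk} - N^1_{ijk} \right]^2}{\left[ N^2_{ijk} + N^1_{ijk} \right]^2}
- \frac{\left[ N^2_{ijk} + N^1_{ijk} \right]}{\left[ N^2_{ijk} + N^1_{ijk} \right]^2} \right\} , \tag{3-20}$$

D2-BC:

$$|\hat{\gamma}_{ij}|^2 = \frac{1}{2K (\hat{I}_0 T / K)^2} \sum_{k} \left\{ \left[ N^2_{ijk} - N^1_{ijk} \right]^2 - \left[ N^2_{ijk} + N^1_{ijk} \right] \right\} , \tag{3-21}$$

where $\hat{I}_0$ is given by (3-6), and D3-BC:

$$|\hat{\gamma}_{ij}|^2 = \frac{2 c^2 K}{\left[ \sum_{k} \left( N^2_{ijk} + N^1_{ijk} \right) \right]^2}
\sum_{k} \left[ \left( N^2_{ijk} - N^1_{ijk} \right)^2 - \left( N^2_{ijk} + N^1_{ijk} \right) \right] . \tag{3-22}$$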
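The effect of a bias correction can be illustrated with a small Monte Carlo experiment. The sketch below is an assumption-laden illustration, not a reproduction of the reported simulations: it takes the per-frame normalized form of the first discrete estimator, $(2c^2/K)\sum_k[(N^2-N^1)/(N^2+N^1)]^2$, and applies a first-order correction that subtracts $1/(N^1+N^2)$ inside the sum, which is what the Poisson variance of the count difference suggests. The light level, visibility, and phase values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
i0_dt, c, gamma_mod, gamma_arg = 50.0, 1.2, 0.2, 0.3   # hypothetical values
thetas = np.array([0.0, np.pi / 2])                    # K = 2 frames
trials = 20000

g = gamma_mod * np.sin(gamma_arg + thetas)
n1 = rng.poisson(i0_dt * (c - g), size=(trials, 2))    # arm 1 means: I0*dt*(c - g)
n2 = rng.poisson(i0_dt * (c + g), size=(trials, 2))    # arm 2 means: I0*dt*(c + g)
tot = np.maximum(n1 + n2, 1)                           # guard against empty frames

ratio_sq = ((n2 - n1) / tot) ** 2
d1 = 2 * c**2 * ratio_sq.mean(axis=1)                  # uncorrected estimate
d1_bc = 2 * c**2 * (ratio_sq - 1.0 / tot).mean(axis=1) # subtract 1/(N1+N2)

bias_d1 = d1.mean() - gamma_mod**2
bias_bc = d1_bc.mean() - gamma_mod**2
```

At this light level the uncorrected estimate is biased upward by an amount comparable to the squared modulus itself, while the corrected estimate has a much smaller residual bias; the variance of the two is essentially the same.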


3.1.2  Continuous-Phase  Systems 


Another possibility is that the controlled phase term $\theta(t)$ of
Eq. (3-9) varies linearly from 0 to $2\pi$ as t goes from $t_0$ to
$t_0 + T$: $\theta(t) = 2\pi (t - t_0)/T$, $t_0 \le t \le t_0 + T$. In
this case, $|\gamma_{ij}|^2$ can be recovered from a sequence of four
frames, each with integration time equal to T/K, K = 4. Let


to*T/4 


t„*T/2 


=  f  /  -  l|j(‘)]  =  i-  /  [ifj(t)  -  l}j(t)]  dt, 


t„.T/4 


t„.3T/4 


*  I.  /  [*1j(**  ■  'lj(**]  “ij  °  I„  J  [’lj(*)  ■  ’ij(l)]  ‘‘I- 


ton/2 


10*31/4 


(3-23) 


Then,  from  Eqs.  (3-1)  and  (3-5), 


Iq  -  C^j)^  +  (B^j  -  D^j)^ 


(3-24) 
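The four-frame recovery can be checked numerically. In the sketch below the four quantities are taken, as an assumption for illustration, to be the unnormalized integrals of $I^2_{ij}(t) - I^1_{ij}(t)$ over the four successive quarters of the collection period; with that choice the squared modulus follows from the two differences with the scale factor $\pi^2 / (8 I_0^2 T^2)$. The scale factor depends on how the integrals are normalized, and all values and names here are hypothetical.

```python
import math

def quarter_integrals(gamma_mod, offset, i0, period, steps=2000):
    """Integrate I2 - I1 = 2*i0*|gamma|*sin(offset + 2*pi*t/T) over each
    quarter of the collection period T (midpoint rule)."""
    h = period / 4 / steps
    out = []
    for q in range(4):
        t_lo = q * period / 4
        out.append(sum(
            2 * i0 * gamma_mod
            * math.sin(offset + 2 * math.pi * (t_lo + (n + 0.5) * h) / period) * h
            for n in range(steps)))
    return out

i0, period = 1000.0, 2.0                      # hypothetical values
qa, qb, qc, qd = quarter_integrals(0.3, 0.9, i0, period)
recovered = math.pi**2 * ((qa - qc)**2 + (qb - qd)**2) / (8 * (i0 * period)**2)
```

Here `offset` stands for the sum of the coherence argument and the fixed aberration; the recovered value equals the squared modulus (0.09 in this example) for any offset.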
Arguments similar to those of Section 3.1.1 can be used to derive
another two estimators. Let


(3-25) 


(3-26) 


(3-27) 


Then  "continuous  estimator  2"  (C2)  is 


^  2  2 

'  (*IJ  -  'ij)  *  [hi  -  Ofj]  •  (3-28) 

At low light levels both of these estimators have a bias term which is
proportional to $1/I_0$. As in the previous section, bias correction
terms can be added to reduce the bias of these estimators.




3.1.3  Phase  Diversity  From  Atmospheric  Turbulence 

The third possibility we consider is that the phase error term
$\phi_{ij}(t)$ is caused by atmospheric turbulence. Here the frame
integration time $\Delta t$ is assumed to be short enough that the
phase errors within each frame are essentially constant. As discussed
earlier in Section 2, this requirement limits $\Delta t$ to be less
than or equal to the coherence time of the atmosphere. Typical values
for the coherence time of the atmosphere in the optical regime vary
between 5 and 20 ms. Assuming that the phase error is constant over
each frame, $\phi_{ij}(t) = \phi_{ijk}$ for $t \in [t_{k-1}, t_k]$, we
can use the discrete-phase estimators discussed in Section 3.1.1. When
$\phi_{ijk}$, k = 1, ..., K, is uniformly distributed over the interval
$[-\pi, \pi]$ we then have, in the limit for large K,

$$\frac{1}{K} \sum_{k=1}^{K} \sin(\arg\gamma_{ij} + \phi_{ijk}) \to 0 , \qquad
\frac{1}{K} \sum_{k=1}^{K} \sin^2(\arg\gamma_{ij} + \phi_{ijk}) \to \frac{1}{2} , \tag{3-29}$$

and conditions (3-14) are satisfied.
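The large-K behavior for uniformly distributed phases can be checked with a few lines of code. This is only a sketch: the coherence argument (0.4), the number of frames, and the random seed are arbitrary choices.

```python
import math
import random

rng = random.Random(1)
n_frames = 20000
values = [math.sin(0.4 + rng.uniform(-math.pi, math.pi)) for _ in range(n_frames)]

mean_sin = sum(values) / n_frames                    # tends to 0 for large K
mean_sin_sq = sum(v * v for v in values) / n_frames  # tends to 1/2 for large K
```

Both sample averages approach their limits at a rate of roughly one over the square root of the number of frames.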
3.2  ESTIMATOR  PERFORMANCE 


Here  we  examine  the  performance  of  the  estimators  described  in 
Section  3.1.  An  important  measure  of  performance  which  we  focus  on  is 
the  root  mean-squared  error.  In  Section  3.2.1,  we  derive  asymptotic 
expressions  for  the  bias  and  squared  error  which  are  valid  at  moderate 
to  high  light  levels.  The  low  light  level  performance  of  the 
estimators  is  determined  by  the  use  of  Monte  Carlo  simulation.  In 
Section 3.2.2, we derive a lower bound on the expected
image-reconstruction error. An important feature of the bound is that
it accounts for the object support constraint which is imposed in the
reconstruction algorithm.
3.2.1 Estimator Bias and Squared Error

Asymptotic expressions for the normalized bias (NB), normalized
standard deviation (NSD) and normalized root mean-squared error (NRMSE)
of several of the squared-modulus estimators were derived with the aid
of the symbolic-computation program MAPLE [3.16]. For a given
squared-modulus sample $|\gamma_{ij}|^2$, the NB and NSD are defined as

$$\mathrm{NB} = \left[ E\{ |\hat{\gamma}_{ij}|^2 \} - |\gamma_{ij}|^2 \right] / |\gamma_{ij}|^2 \tag{3-30a}$$

and

$$\mathrm{NSD} = \left[ E\{ ( |\hat{\gamma}_{ij}|^2 - E\{ |\hat{\gamma}_{ij}|^2 \} )^2 \} \right]^{1/2} / |\gamma_{ij}|^2 , \tag{3-30b}$$

where $E\{\cdot\}$ denotes expectation. The NRMSE can be computed from
the NB and NSD as

$$\mathrm{NRMSE} = \sqrt{ \mathrm{NB}^2 + \mathrm{NSD}^2 } . \tag{3-30c}$$
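When the estimator is studied by simulation rather than analytically, the same three quantities can be computed from a set of estimates, with sample averages standing in for the expectations. A minimal sketch, in which the function name and the four sample values are hypothetical:

```python
import math

def normalized_errors(estimates, true_value):
    """Return (NB, NSD, NRMSE) using sample moments in place of expectations."""
    n = len(estimates)
    mean = sum(estimates) / n
    variance = sum((e - mean) ** 2 for e in estimates) / n
    nb = (mean - true_value) / true_value
    nsd = math.sqrt(variance) / true_value
    return nb, nsd, math.sqrt(nb**2 + nsd**2)

# Four hypothetical estimates of a squared modulus whose true value is 0.04.
nb, nsd, nrmse = normalized_errors([0.05, 0.03, 0.06, 0.04], 0.04)
```

In this example the bias contributes 0.125 and the standard deviation about 0.28, so the NRMSE of about 0.31 is dominated by the spread of the estimates.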

The expressions for the NB, NSD and NRMSE of each of the four
estimators are complicated functions of the parameters
$|\gamma_{ij}|$, $I_0$, T, K, and $I_B$, and are therefore omitted
here. The expressions for the unnormalized versions of these
quantities and details of their derivation can be found in Appendix A.
We plot the NB, NSD and NRMSE as a function of $I_0 T$, since $I_0 T$
is the average number of photons detected during time T in a single
detector element (i,j) at the output of one of the output arms of the
interferometer. The estimate $|\hat{\gamma}_{ij}|^2$, however, is
based upon an average of $2 I_0 T$ photons, since it is based upon the
counts detected in both arms of the interferometer.

It is also of interest to consider the mean-squared error of the
modulus $|\hat{\gamma}_{ij}|$. Considering only the leading terms in
the mean-squared error given in Eqs. (A-3) and (A-8) in Appendix A
(i.e., moderately high light level), c = 1 (i.e., no bias exposure)
and K = 2 frames, the mean-squared error of $|\hat{\gamma}_{ij}|^2$ is
$2 |\gamma_{ij}|^2 / (I_0 \Delta t)$. By algebraic manipulation it can
be shown that this implies that the mean-squared error of
$|\hat{\gamma}_{ij}|$ is

$$\mathrm{MSE}\{ |\hat{\gamma}_{ij}| \} = \frac{1}{2 I_0 \Delta t} . \tag{3-31}$$

Note that this first-order approximation to the mean-squared error of
$|\hat{\gamma}_{ij}|$ is independent of the value of $|\gamma_{ij}|$.


Plots of the expressions for NB and NSD in Figures 3-2a and 3-2b show
the relative contributions to the NRMS error due to bias and standard
deviation respectively, for each of the four estimators with
$|\gamma_{ij}| = 0.2$, $I_B = 0.2 I_0$ and $I_0 T$ varying from 10 to
1000 photons. For the D1 and D2 estimators, the photon collection was
divided into two frames, with $\theta_1 = 0$ and $\theta_2 = \pi/2$,
whereas for the C1 and C2 estimators, four frames were required. As
expected, the estimator performance improves as $I_0 T$, the average
total number of photons collected per detector element, increases. The
bias and standard deviation of the D1 and D2 estimators are nearly
identical. A similar trend is observed for the C1 and C2 estimators.
The D1 and D2 estimators, which were based on the discrete-phase
system, perform better than the continuous-phase system C1 and C2
estimators. For all four estimators, however, the NRMS error is
dominated by the standard deviation of the estimator, which has a
strong dependence on $|\gamma_{ij}|^2$. This is due to the fact that
the estimators are trying to determine the squared difference between
the means of the two Poisson random variables, $N^1_{ijk}$ and
$N^2_{ijk}$. This difference is directly proportional to
$|\gamma_{ij}|$ [see Equations (3-1) and (3-5)], and as the value of
$|\gamma_{ij}|$ decreases, the average difference between $N^1_{ijk}$
and $N^2_{ijk}$ diminishes, causing the standard deviation of the
estimate to rise dramatically. Thus, although bias corrections can be
easily incorporated into the estimators, they will improve the
estimator performance only slightly.
Figure 3-2. (a) Bias and (b) Standard Deviation of the Squared-Modulus
Estimators as a Function of the Average Number of Photons
per Detector Element ($I_0 T$) for $|\gamma| = 0.2$ and $I_B = 0.2 I_0$.


Another measure of performance is the number of photons required to
achieve a specified NRMS error in estimating a given squared-modulus
sample, $|\gamma_{ij}|^2$. This is illustrated in Figure 3-3 for the
D2 estimator with K = 2 frames and $I_B = 0.2 I_0$. To achieve a NRMSE
of 0.1 when $|\gamma_{ij}| = 0.25$ requires, on average, 7700 photons
per detector. To achieve a NRMSE of 0.5 when $|\gamma_{ij}| = 0.5$,
however, requires only 80 photons per detector. On the other hand, if
an average of 2000 photons is collected in each detector element, then
the NRMS error in estimating modulus values which are greater than 0.5
is less than 10 percent, while the error in estimating modulus values
which are less than 0.1 is greater than 50 percent. In general, this
would imply that the performance is better for objects which consist
of a small collection of points, where the mutual coherence modulus
samples are relatively large, than for extended objects, for which the
mutual coherence values are small at higher spatial frequencies. At
extremely low light levels, the expressions derived for NB and NRMSE
are not accurate since they are based on low-order asymptotic
expansions in $1/(I_0 T)$. Investigations of the estimator performance
in the low light regime, $I_0 \Delta t \le 10$ photons, were carried
out by the use of Monte Carlo simulation. At each light level,
$I_0 \Delta t$, and visibility level, $|\gamma|$, 1,000 realizations of
the output of a single pair of detectors $(N^1_{ijk}, N^2_{ijk})$,
k = 1, ..., K, were simulated. Each of the three discrete estimators,
D1, D2, and D3, was applied. The estimator bias and squared error were
then determined from the sample mean and sample variance of the
estimates.


Two scenarios were considered. In the first scenario, a space-based
interferometer was assumed and a K = 2 frame data collection
($\theta_1 = 0$ and $\theta_2 = \pi/2$ in Eq. 3-11) was simulated. In
the second scenario, a ground-based interferometer was assumed and a
K = 20,000 frame collection with a uniformly-distributed phase error
term was simulated. Figure 3-4 shows the RMS error in the modulus
estimate $|\hat{\gamma}_{ij}|$ (i.e., the square root of
$|\hat{\gamma}_{ij}|^2$) for the two-frame
Figure 3-3. The Number of Photons Required per Detector Element to
Achieve a Specified NRMSE for the D2 Estimator, K = 2
Frames, $I_B = 0.2 I_0$ and NRMSE = 0.1, 0.2, 0.5.

Figure 3-4. RMS Error of the Modulus Estimate $|\hat{\gamma}_{ij}|$ for
a Two Frame Collection.

Figure 3-5. RMS Error of the Modulus Estimate $|\hat{\gamma}_{ij}|$ for
a 20,000 Frame Collection.
collection at a range of light levels $K I_0 \Delta t$ and fringe
visibilities $|\gamma|$. Note that the RMS error of
$|\hat{\gamma}_{ij}|$ is nearly independent of $|\gamma_{ij}|$ and is
approximately $1/\sqrt{2 I_0 \Delta t}$, as predicted. In Figure 3-5
the RMS error of $|\hat{\gamma}_{ij}|$ is shown for the
K = 20,000-frame collection. Comparing the two cases, we see that
about three orders of magnitude more photons are required in the
20,000-frame collection to achieve a performance comparable to that of
the two-frame collection.
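A Monte Carlo experiment of the kind described for the two-frame scenario can be sketched as follows. This is an illustration under stated assumptions rather than the simulation used for the figures: it takes no background (c = 1), phase steps of 0 and pi/2, the per-frame normalized estimator, an arbitrary coherence argument of 0.3, and a light level well above the low-light regime so that the first-order prediction $1/\sqrt{2 I_0 \Delta t}$ applies.

```python
import numpy as np

def rms_modulus_error(gamma_mod, i0_dt, trials=20000, seed=0):
    """RMS error of the two-frame modulus estimate for K = 2, c = 1."""
    rng = np.random.default_rng(seed)
    g = gamma_mod * np.sin(0.3 + np.array([0.0, np.pi / 2]))  # hypothetical arg
    n1 = rng.poisson(i0_dt * (1 - g), size=(trials, 2))
    n2 = rng.poisson(i0_dt * (1 + g), size=(trials, 2))
    ratio = (n2 - n1) / np.maximum(n1 + n2, 1)
    modulus = np.sqrt((ratio**2).sum(axis=1))   # (2/K) * sum with K = 2
    return float(np.sqrt(np.mean((modulus - gamma_mod) ** 2)))

i0_dt = 200.0
predicted = 1 / np.sqrt(2 * i0_dt)                       # 0.05
rms = {m: rms_modulus_error(m, i0_dt) for m in (0.2, 0.5)}
```

For both visibilities the simulated RMS error stays close to the predicted value of 0.05, which illustrates the weak dependence on the modulus.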


3.2.2  Lower  Bounds  on  Image  Reconstruction  Error 


Asymptotic expansions and Monte Carlo simulations were used in Section
3.2.1 to derive explicit expressions and plots of the error in
estimating the Fourier intensity components $|\gamma_{ij}|^2$. In
assessing the performance of the image reconstruction algorithm
described in Section 3.3, this approach is not feasible since the
algorithm is iterative and nonlinear. Our approach here is to lower
bound the image reconstruction error. In this subsection we present
lower bounds on the image reconstruction error for the case of image
reconstruction from amplitude interferometer measurements. The bounds
derived here are independent of the procedure used to reconstruct the
image and thus represent the best possible performance of any such
estimator. These bounds provide a means of comparing a wide variety of
reconstruction algorithms against some "best possible" performance
standard.


We will denote the object intensity as f(x,y), where we assume that
$f(x,y) \ge 0$ for all x and y in the field of view and that f has
finite support. This allows f to be described by samples of its
Fourier transform, which we represent in this case by $\gamma_{ij}$.
By the use of Parseval's theorem, we can then represent the squared
error between f and an estimate, say $\hat{f}$, as a function of
$\gamma_{ij}$ and its estimates $\hat{\gamma}_{ij}$:

$$\iint |f(x,y) - \hat{f}(x,y)|^2 \, dx \, dy = \sum_{ij} |\gamma_{ij} - \hat{\gamma}_{ij}|^2 . \tag{3-32}$$

Our strategy is to develop lower bounds on the error term
$\sum_{ij} |\gamma_{ij} - \hat{\gamma}_{ij}|^2$.


Appendix B contains a derivation of a Cramer-Rao (CR) type lower bound
[3.15,3.21] which incorporates side information. In the present
application the side information incorporated into the reconstruction
algorithm is the support of the object and its nonnegativity; both are
incorporated into the algorithm described in Section 3.3.

Let $\gamma_{ij} = \gamma^R_{ij} + i \gamma^I_{ij}$, where the
non-subscripted $i = \sqrt{-1}$. For convenience, we will represent
the complex mutual coherence samples $\gamma_{ij}$, i, j = 1, ..., N,
by the $2N^2$-length real vector

$$\gamma = [\gamma^R_{00}, \gamma^I_{00}, \gamma^R_{01}, \gamma^I_{01}, \ldots, \gamma^R_{ij}, \gamma^I_{ij}, \ldots]^T . \tag{3-33}$$

Denote the estimate by $\hat{\gamma}$. For simplicity we assume
$\hat{\gamma}$ is unbiased. In Appendix B a more general result is
derived for biased estimators. The CR bound of Appendix B can be
expressed as (cf. Theorem 1 of Appendix B)

$$E\{ (\hat{\gamma} - \gamma)(\hat{\gamma} - \gamma)^T \} \ge P (P J_\gamma P)^{+} P = Q J_\gamma^{-1} , \tag{3-34}$$

where $J_\gamma$ is the Fisher information matrix of $\gamma$, defined
by Eq. (13) of Appendix B, T denotes matrix transposition, (+) denotes
the Moore-Penrose pseudoinverse (cf. Eq. (9), Appendix B) and P and Q
are projection matrices which depend on the object support (cf. Eqs.
(38) and (50) in Appendix B). In (3-34), Q reflects the amount of
improvement afforded by the use of the support constraint. A bound on
the total or absolute mean squared error of the image reconstruction
can then be found by

$$\sum_{ij} E\{ |\hat{\gamma}_{ij} - \gamma_{ij}|^2 \} = \mathrm{tr}[ E\{ (\hat{\gamma} - \gamma)(\hat{\gamma} - \gamma)^T \} ] \ge \mathrm{tr}\{ Q J_\gamma^{-1} \} , \tag{3-35}$$

where tr[·] denotes the matrix trace operation.
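For a single pixel without a support constraint (Q equal to the identity) the trace bound can be evaluated directly. The sketch below is an illustration under stated assumptions: it uses the standard Fisher information for independent Poisson counts, $\sum_m (1/\Lambda_m)\, \nabla\Lambda_m \nabla\Lambda_m^T$, with the real and imaginary parts of the coherence as parameters, treats the frame phases as known, and writes the fringe term as $\gamma^R \sin\phi + \gamma^I \cos\phi$. Function names and values are hypothetical, and the normalization may differ from the one adopted in Appendix B.

```python
import math

def fisher_information(gamma_re, gamma_im, phis, i0, i_b, dt):
    """2 x 2 Fisher information for (Re gamma, Im gamma) from both arms."""
    j = [[0.0, 0.0], [0.0, 0.0]]
    for phi in phis:
        g = gamma_re * math.sin(phi) + gamma_im * math.cos(phi)
        lam1 = (i0 * (1.0 - g) + i_b) * dt
        lam2 = (i0 * (1.0 + g) + i_b) * dt
        # Gradient of lam2; the gradient of lam1 differs only in sign.
        grad = (i0 * dt * math.sin(phi), i0 * dt * math.cos(phi))
        weight = 1.0 / lam1 + 1.0 / lam2
        for r in range(2):
            for s in range(2):
                j[r][s] += weight * grad[r] * grad[s]
    return j

def trace_of_inverse(j):
    """Trace of the inverse of a 2 x 2 matrix."""
    det = j[0][0] * j[1][1] - j[0][1] * j[1][0]
    return (j[0][0] + j[1][1]) / det

# Hypothetical pixel: gamma = 0.3 + 0.4i, two frames with phases 0 and pi/2.
info = fisher_information(0.3, 0.4, [0.0, math.pi / 2], i0=100.0, i_b=0.0, dt=1.0)
bound = trace_of_inverse(info)
```

In this example the bound on the squared error of the complex sample is 0.00875, and it scales inversely with the number of photons per frame.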


The bound in (3-35) is directly applicable in the case where the
aberration $\phi_{ijk}$, i, j = 1, ..., N, k = 1, ..., K, is fixed and
nonrandom. If $\phi_{ijk}$ are unknown or random they are referred to
as nuisance parameters. When nuisance parameters are present,
calculation of an error lower bound is more difficult. One approach
for the case of random nuisance parameters is to determine the minimum
lower bound for the worst case nuisance parameters; such a bound is
called a minmax lower bound. Another approach which is available when
the distribution of the nuisance parameters is known is to derive the
Fisher information matrix $\mathcal{J}$ of the augmented vector
$(\gamma, \phi)$, where $\phi$ is the lexicographical ordering of
$\phi_{ijk}$ into a real-valued $KN^2$ length vector as in (3-33). One
can then form a bound similar to (3-35) based on $\mathcal{J}$.
$\mathcal{J}$ has the form

$$\mathcal{J} = \begin{bmatrix} J_\gamma & J_{\gamma\phi} \\ J_{\phi\gamma} & J_\phi \end{bmatrix} , \tag{3-36}$$

where for instance $J_\phi$ is the Fisher information associated with
the nuisance parameters. The lower bound then takes the form

$$E\{ (\hat{\gamma} - \gamma)(\hat{\gamma} - \gamma)^T \} \ge P \left( P \left[ J_\gamma - J_{\gamma\phi} J_\phi^{-1} J_{\phi\gamma} \right] P \right)^{+} P . \tag{3-37}$$

Equation (3-35) can then be interpreted as the first term in a series
expansion of (3-37). Note that (3-35) and (3-37) are equivalent when
$J_{\gamma\phi} = J_{\phi\gamma} = 0$, i.e., when the nuisance
parameters are orthogonal to the parameters of interest $\gamma$.

We  derive  the  bound  of  (3-35)  for  the  amplitude  interferometer 
image  reconstruction.  In  light  of  the  discussion  above,  this  bound  may 
be  overly  optimistic,  but  it  should  give  an  indication  of  the  order  of 
magnitude  of  the  expected  image  reconstruction  error.  A 
straightforward  calculation  using  Eq.  (13)  of  Appendix  B  for  the  Fisher 
information  matrix  and  Eq.  (3-7)  for  the  likelihood  function  yields 


where 


(3-38) 


(3-39) 


hl2  k21 

^ij  ■  ‘’ij 


=  E  Z 
k 


cos(*iik)  ^^^^*iik) 

l7,jl^s1nVg7ij 


(3-40) 


(3-41) 
Here diag{$B_{ij}$} indicates a diagonal matrix with blocks $B_{ij}$
along its diagonal. Also, recall that $c = 1 + I_B/I_0$ and
$\phi_{ijk}$ is given by either (3-9) or (3-10). Calculation of Q is
also straightforward but we omit the details here. Let S be the
Fourier transform of the support constraint. Then

$$\gamma_{ij} = \sum_{i'j'} S_{i-i',\, j-j'} \, \gamma_{i'j'} . \tag{3-42}$$

This relationship is expressed more compactly as

$$[I - C] \gamma = R \gamma = 0 , \tag{3-43}$$

where I is the $2N^2 \times 2N^2$ identity matrix and C is a symmetric
block-circulant matrix with entries $S_{i-i',\, j-j'}$. Q then becomes

$$Q = I - J_\gamma^{-1} R^T \left[ R J_\gamma^{-1} R^T \right]^{+} R \tag{3-44}$$

and the right-hand side of (3-34) is

$$Q J_\gamma^{-1} . \tag{3-45}$$

Calculation of the squared-error lower bound of (3-35) requires the
evaluation of (3-40) through (3-45) which can be accomplished
numerically. As a simple example though, consider the case where no
support constraint is in use, Q = I, one frame is collected, K = 1, and
where $\phi_{ijk}$ takes on the values 0 and $\pi/2$ with equal
probability. Then
^ii  =  =  0  in  (3-40)  and 

w 
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bff  =  j  -5 - -  =  I - - 2 

’J  2c2.  I7..l2s1n2(arg7ij)  -  [j].f 

Substituting  (3-46)  into  (3-39),  (3-40),  (3-35)  and  (3-32)  results  in 
J  J  lf(x,  y)  -  f(x,  y)|2  dx  dy  ^  [c^  -  (3-47) 

We see that the absolute squared error is inversely proportional to the
average light level per collection frame, $I_0 \Delta t$, and is
directly proportional to the difference

$$c^2 - |\gamma_{ij}|^2 = 1 + \frac{2 I_B}{I_0} + \frac{I_B^2}{I_0^2} - |\gamma_{ij}|^2 . \tag{3-48}$$

This bound increases as the background light level $I_B$ increases or
as the squared modulus $|\gamma_{ij}|^2$ decreases: either change
causes a decrease in the measurable fringe contrast. Related error
behavior is seen in the digital simulations in Section 3.3. In Section
3.3.2 we observe that diffuse objects, those which have smaller fringe
visibility values $|\gamma_{ij}|$, are more difficult to reconstruct
than objects which contain specular or glinty components.

3.3 DIGITAL SIMULATION EXPERIMENTS

Once  the  squared-modulus  of  the  mutual  coherence  has  been 
estimated,  an  image  of  the  object  intensity  can  be  determined  by  using 
the  fact  that  the  mutual  coherence  Is  just  the  Fourier  transform  of  the 
object  image  intensity.  Therefore,  reconstruction  of  the  object 
intensity  from  the  squared  modulus  of  the  mutual  coherence  function 
requires  the  retrieval  of  the  phase  of  the  mutual  coherence  function. 
This  phase  retrieval  can  be  accomplished  with  the  iterative  Fourier 
transform  (IFT)  algorithm  [3. 6, 3. 7, 3. 8]  using  positivity  and  support 
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constraints.  The  IFT  algorithm  is  closely  related  to  the  Gerchberg- 
Saxton  algorithm  [3.11].  Estimates  of  the  object  support  are  formed 
from  the  estimate  of  3s  follows:  (i)  is  inverse  Fourier 
transformed  to  provide  an  estimate  of  the  autocorrelation  of  the 
object,  (ii)  the  autocorrelation  estimate  is  then  thresholded  to 
provide  an  estimate  of  the  support  of  the  autocorrelation  of  the 
object,  (iii)  an  initial  estimate  of  the  object  support  is  formed  from 
the  autocorrelation  support  by  using  a  triple-intersection  rule 
[3. 2, 3. 9].  This  initial  object  support  depends  on  thresholded  values 
and  thus  may  exclude  parts  of  the  actual  object.  Hence  as  the 
iterations  progress,  the  support  constraint  is  enlarged  by  including 
neighboring  pixels,  thus  ensuring  that  the  whole  object  is  eventually 
contained  within  the  support  constraint.  Each  iteration  of  the  IFT 
algorithm  consists  of  the  following  four  steps,  as  illustrated  in 
Figure 3-6: (i) the current object intensity estimate is Fourier
transformed to produce an estimate of the Fourier transform of the
object; (ii) the modulus of the Fourier transform is replaced by the
estimate of $|\gamma|$; (iii) the result is inverse Fourier transformed;
(iv) the object-domain constraints of positivity and support are
enforced using the hybrid input-output algorithm in conjunction with
the error-reduction algorithm [3.6, 3.7, 3.8].
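
The four steps of one iteration can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the code used for the simulations reported here: the function names, the feedback parameter `beta`, the iteration counts, the random starting guess, and the use of a fixed support (no enlargement between iterations) are all choices made for the sketch.

```python
import numpy as np

def ift_reconstruct(modulus, support, n_hio=100, n_er=20, beta=0.7, seed=0):
    """Iterate between the Fourier-modulus and object-domain constraints.

    modulus : 2-D array, Fourier modulus estimate in np.fft.fft2 layout
    support : 2-D boolean array, True where the object may be nonzero
    beta, n_hio, n_er : illustrative values, not taken from the report
    """
    rng = np.random.default_rng(seed)
    g = rng.random(modulus.shape) * support        # random nonnegative start
    for k in range(n_hio + n_er):
        G = np.fft.fft2(g)                         # (i) Fourier transform
        G = modulus * np.exp(1j * np.angle(G))     # (ii) impose the modulus
        g_new = np.fft.ifft2(G).real               # (iii) inverse transform
        bad = ~support | (g_new < 0)               # pixels violating constraints
        if k < n_hio:                              # (iv) hybrid input-output
            g = np.where(bad, g - beta * g_new, g_new)
        else:                                      # (iv) error reduction
            g = np.where(bad, 0.0, g_new)
    return g

def modulus_error(g, modulus):
    """Normalized RMS mismatch between |FFT(g)| and the target modulus."""
    diff = np.abs(np.fft.fft2(g)) - modulus
    return np.sqrt(np.sum(diff ** 2) / np.sum(modulus ** 2))

# Toy usage: a small nonnegative object inside a zero-padded array.
obj = np.zeros((32, 32))
obj[12:18, 10:20] = 1.0
obj[14, 14] = 5.0
support = np.zeros((32, 32), dtype=bool)
support[10:20, 8:22] = True
target = np.abs(np.fft.fft2(obj))
recon = ift_reconstruct(target, support)
print("Fourier modulus error:", modulus_error(recon, target))
```

Because the last iterations are error-reduction steps, the returned estimate is nonnegative and zero outside the support. Convergence to the true object is not guaranteed by this sketch, and the result may be translated or rotated by 180 degrees relative to the original.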

We  performed  a  number  of  simulation  experiments  to  determine  the 
performance  of  the  IFT  algorithm  for  image  reconstruction  from  low 
light  levels.  Three  distinct  series  of  simulation  experiments  were 
performed.  Initially,  a  series  of  simple  simulations  was  performed  to 
determine  the  robustness  of  the  IFT  algorithm  with  respect  to  Fourier 
modulus  error.  Independent  and  identically  distributed  Gaussian  noise 
was  added  to  each  Fourier  modulus  sample  to  approximate  the  type  of 
measurement  error  that  might  occur  with  the  amplitude  interferometer. 
It  was  found  that,  for  the  diffuse  object  used  in  the  simulation,  a 
useful  reconstruction  was  obtained  even  at  noise  levels  which  gave  a 
Fourier  modulus  error  of  25%.  This  is  described  in  Section  3.3.1.  The 
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Figure  3-6.  Block  Diagram  of  the  Iterative  Fourier  Transform 
Algorithm. 
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second  series  of  simulation  experiments  was  performed  to  demonstrate 
the  performance  of  the  discrete  stepped-phase  system  described  in 
Section 3.1.1 for the case of a two-frame collection, one frame with
$\delta(t) = 0$ for $t \in [0, T/2]$, the other with $\delta(t) = \pi/2$ for $t \in [T/2, T]$. To
demonstrate  the  object-dependent  performance  of  the  imaging  system, 
three  distinct  objects  were  used:  a  simple  object  consisting  of  four 
equally-bright points, one being four times the area of the other
three;  a  satellite  which  had  both  specular  and  diffuse  components;  and 
a  completely  diffuse  image  of  a  simulated  post-boost  vehicle  (BUS)  with 
several  attached  re-entry  vehicles  (RV's)  and  one  detached  RV.  The 
general  trend  we  observed  was  that  the  specular  objects  were  easier  to 
reconstruct  and  that  reasonable  reconstructions  were  obtained  with  much 
less  light  for  specular  objects  than  for  diffuse  objects.  This  is 
described  in  Section  3.3.2.  In  the  final  series  of  simulation 
experiments  we  simulated  a  ground-based  amplitude  interferometer  which 
used  the  effects  of  turbulent  atmosphere  to  provide  phase  diversity  as 
described  in  Section  3.1.3.  The  goal  was  to  demonstrate  the 
performance  of  the  amplitude  interferometer  imaging  system  for  the 
Firefly  experiment.  Simulations  were  performed  for  two  cases:  a 
collection  using  the  48"  Cassegrain  telescope  facility  at  Goddard  and  a 
collection using the ISTEF 24" Cassegrain telescope. This is described
later  in  Section  4.3. 

3.3.1  Initial  Simulations  with  Noisy  Modulus  Data 

Clearly,  the  quality  of  the  reconstructed  image  has  a  strong 
dependence on the accuracy of the squared modulus estimate.
Since the IFT algorithm is iterative and highly nonlinear, it is
difficult  to  derive  analytically  the  performance  of  the  IFT  as  a 
function  of  error  in  the  modulus  estimate.  Empirical  simulation 
studies  have  shown,  however,  that  the  algorithm  is  robust  under  certain 
types  of  Fourier  modulus  error  [3.5]. 
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As  an  initial  assessment  of  the  viability  of  the  IFT  algorithm  for 
image  reconstruction  from  noisy  Fourier  modulus  data,  we  performed  a 
series  of  simulations  in  which  Gaussian  noise  of  varying  intensities 
was  added  to  the  Fourier  modulus  of  a  diffuse  object.  This  was  done  to 
approximate  the  types  of  Fourier  modulus  error  one  could  expect  from 
the  estimators  discussed  in  Section  3.1.  The  IFT  algorithm  was  then 
used to reconstruct an image from each simulated noisy Fourier modulus
data set, and the normalized root mean-squared error (NRMSE) of the reconstruction
was  compared  to  the  error  in  the  Fourier  modulus  data  which  was  induced 
by  the  added  Gaussian  noise.  Figure  3-7  shows  the  sequence  of 
reconstructed  images  along  with  the  original  object  used  in  the 
simulation.  Figure  3-8  shows  the  corresponding  sequence  of  Fourier 
modulus data. Gaussian noise with variances of 400, 1K, 2.5K, 10K,
40K, 100K, 300K, 1M, 3M, 10M, and 30M was added to the modulus data to
obtain the Fourier moduli shown in Figure 3-8 (b) through (l). For
reference,  the  peak  of  the  Fourier  modulus  at  DC  was  187,793.  Figure 
3-9  shows  a  plot  of  the  reconstructed  image  NRMSE  versus  the  NRMS 
Fourier  modulus  error.  The  image  reconstruction  error  appears  to  be 
linear  with  the  Fourier  modulus  error  with  an  error  of  approximately 
4.5%  for  the  case  where  no  noise  was  added  to  the  modulus.  The  small 
image  reconstruction  error  which  occurs  at  zero  Fourier  modulus  error 
is  most  likely  due  to  the  "stripe  artifact"  discussed  in  Ref.  [3.8]. 
Above  25%  NRMS  Fourier  modulus  error,  the  reconstruction  had  an  error 
of  more  than  35%,  and  the  object  was  barely  discernible. 
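
The noise-injection step and the NRMS error measure used in this comparison can be sketched as follows. This is an illustrative sketch only: the one-dimensional decaying modulus is a hypothetical stand-in for the two-dimensional data of Figure 3-8 (only its DC peak of 187,793 is taken from the text), and the function names are introduced for this example.

```python
import math
import random

def nrms_error(estimate, truth):
    """Normalized RMS error: sqrt(sum (estimate - truth)^2 / sum truth^2)."""
    num = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    den = sum(t ** 2 for t in truth)
    return math.sqrt(num / den)

def add_gaussian_noise(modulus, variance, seed=0):
    """Add i.i.d. zero-mean Gaussian noise of the given variance to each sample."""
    rng = random.Random(seed)
    sigma = math.sqrt(variance)
    return [m + rng.gauss(0.0, sigma) for m in modulus]

# Hypothetical 1-D stand-in for a Fourier modulus with a DC peak of 187,793.
true_modulus = [187793.0 / (1.0 + k) ** 2 for k in range(256)]

for variance in (400.0, 1.0e4, 1.0e6, 3.0e7):
    noisy = add_gaussian_noise(true_modulus, variance)
    err = nrms_error(noisy, true_modulus)
    print(f"variance {variance:12.0f}: NRMS modulus error = {err:.4f}")
```

With a fixed seed the same unit-variance draws are reused at every noise level, so the NRMS modulus error scales with the noise standard deviation. The noisy samples are not clipped to be nonnegative here; whether the original simulations clipped them is not stated.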

3.3.2  Simulations  of  a  Space-Based  Amplitude  Interferometer 

In  the  second  series  of  simulation  experiments  we  investigated  the 
performance of the amplitude interferometer assuming the discrete
stepped-phase system described in Section 3.1.1, and the estimator
D3-BC of Eq. (3-22). The number of frames collected was K = 2. Such a
system  would  be  appropriate  where  the  aberration  or  phase  errors  are 
fixed  or  slowly  varying.  Here,  we  assumed  that  the  aberrations  were 
fixed  over  the  collection  time. 




Figure  3-7.  Phase  Retrieval  Image  Reconstructions  from  Noisy  Fourier 
Modulus  Data. 
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Figure  3-8.  Noisy  Fourier  Modulus  Data  used  in  the  Reconstructions 
Shown  in  Figure  3-7. 
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Figure  3-9.  Plot  of  the  Absolute  Error  of  the  Reconstructions  in 
Figure  3-7  as  a  Function  of  Fourier  Modulus  Error. 
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Three  different  objects  of  increasing  complexity  were  used  in  the 
simulation  to  demonstrate  the  overall  performance  of  the  combined 
modulus  estimation/image  reconstruction  algorithm  as  a  function  of 
image content. Figures 3-11(a), 3-12(a) and 3-13(a) show the three
objects  used.  Figure  3-10  shows  cuts  through  the  spin-averaged  Fourier 
modulus  of  each  object.  Each  of  the  objects  fits  within  a  64  x  64 
pixel  square,  and  a  128  x  128  array  was  used  in  the  reconstructions. 
The object of Figure 3-11(a), called "four points," consists of three
equally-bright unresolved points and a fourth part being a 2x4
rectangle. Figure 3-12(a), referred to as "satellite," is a model of a
communications satellite, and the object of Figure 3-13(a), "Bus/RV,"
is  a  simulated  post-boost  vehicle  with  several  attached  re-entry 
vehicles  (RV's)  and  one  detached  RV.  As  shown  in  Figure  3-10,  the 
Fourier  modulus  of  the  "four  points"  object  drops  off  slowly,  while  the 
moduli  of  the  "satellite"  and  "Bus/RV"  objects  drop  off  more  rapidly. 

In  each  of  these  simulation  experiments,  one  realization  of  a  two- 
frame collection was simulated and an estimate of $|\gamma|^2$ was formed using
Eq.  (3-22).  Next,  a  reconstruction  of  the  complex  mutual  coherence 
(and  hence  the  object  itself)  was  performed  by  using  the  iterative 
Fourier  transform  (IFT)  algorithm  [3.5-3.10],  using  positivity  and 
support  constraints. 

After  the  object  reconstruction  was  performed,  the  absolute  squared 
error  between  the  reconstruction  and  the  original  object  was  measured 
to  provide  a  quantitative  measure  of  algorithm  performance.  Since  the 
location  of  the  object  within  the  field  of  view  of  the  interferometer 
is  not  uniquely  determined  from  the  modulus  estimate,  the 
reconstruction  can  be  translated  with  respect  to  the  original  object. 
Also, both the object and its 180° rotation have the same Fourier
modulus, so the reconstruction can appear rotated by 180° with respect
to  the  original.  Therefore  the  object  and  reconstruction  must  be 
registered  before  the  absolute  difference  can  be  calculated.  This 
registration  is  done  by  using  the  procedure  described  in  [3.8]. 
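
One simple way to carry out such a registration is sketched below. It is not necessarily the procedure of [3.8]: it is an assumed, integer-pixel approach that tries both the reconstruction and its 180° rotation, finds the circular shift that maximizes the cross-correlation with the reference, and keeps the better of the two. The function name is introduced for this example.

```python
import numpy as np

def register(recon, reference):
    """Align recon to reference over circular shifts and a 180-degree rotation."""
    ref_ft = np.fft.fft2(reference)
    best_peak, best_image = None, None
    for candidate in (recon, np.rot90(recon, 2)):   # image and its 180-degree twin
        # Circular cross-correlation c[s] = sum_x reference[x + s] * candidate[x]
        xcorr = np.fft.ifft2(ref_ft * np.conj(np.fft.fft2(candidate))).real
        peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
        if best_peak is None or xcorr[peak] > best_peak:
            best_peak = xcorr[peak]
            best_image = np.roll(candidate, peak, axis=(0, 1))
    return best_image

# Toy check: rotate and shift an asymmetric object, then register it back.
obj = np.zeros((16, 16))
obj[2:5, 3:9] = 1.0
obj[6, 4] = 3.0
moved = np.roll(np.rot90(obj, 2), (3, -5), axis=(0, 1))
aligned = register(moved, obj)
print("max abs difference after registration:", np.max(np.abs(aligned - obj)))
```

The sketch assumes periodic (wrap-around) shifts, which is adequate when the object sits well inside a zero-padded array; sub-pixel alignment is not attempted.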
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Spin- Averaged  Visibility  of  Three  Objects 
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Figure  3-10.  A  Plot  of  Cuts  through  the  Spin-Averaged  Fourier  Moduli 
of  the  "Four  Points,"  "Satellite,"  and  "Bus/RV"  Objects. 
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(a)  Original 


Figure 3-11. Images Reconstructed from Simulated Amplitude
Interferometer Measurements of the "Four Points" Object.
(a) Object; (b)-(e) images reconstructed from unfiltered
Fourier modulus data at 10, 20, 100, and 500 photons per detector;
(f)-(i) images reconstructed from Wiener-filtered Fourier modulus
data at 10, 20, 100, and 500 photons per detector.
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(a)  Original 

Figure 3-12. Images Reconstructed from Simulated Amplitude
Interferometer Measurements of the "Satellite" Object.
(a) Object; (b)-(e) images reconstructed from unfiltered
Fourier modulus data at 100, 200, 1K, and 5K photons per detector;
(f)-(i) images reconstructed from Wiener-filtered Fourier modulus
data at 100, 200, 1K, and 5K photons per detector.
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(a)  Original 

Figure 3-13. Images Reconstructed from Simulated Amplitude
Interferometer Measurements of the "Bus/RV" Object.
(a) Object; (b)-(e) images reconstructed from unfiltered
Fourier modulus data at 200, 500, 2K, and 5K photons per detector;
(f)-(i) images reconstructed from Wiener-filtered Fourier modulus
data at 200, 500, 2K, and 5K photons per detector.
Figures 3-11(b)-(e) show reconstructions from simulated two-frame
measurements of the "four points" object for the case of $I_0 \Delta t$ = 10, 20,
100, and 500 photons per detector per frame. Figures 3-12(b)-(e) show
reconstructions of the "satellite" for the case of 100, 200, 1000, and
5000 photons per detector per frame. Figures 3-13(b)-(e) show
reconstructions of the "Bus/RV" for simulations of 200, 500, 2000, and
5000 photons per detector per frame. The simpler
"four points" object requires far fewer photons for a reasonable
reconstruction than the "Bus/RV" object. The locations of the four
points can be seen with as few as 10 photons per detector per frame.
The "satellite" object, which contains glints, also reconstructs with
recognizable features down to 100 photons per detector per frame.

The impact of Wiener filtering the Fourier modulus estimates before
reconstruction was also investigated. The Wiener filter has been shown
to be the optimal filter in the restoration of images degraded by
additive Gaussian noise [3.17] but it also plays a role in iterative
image reconstruction algorithms [3.18-3.20]. In the current context,
we use the Wiener filter to reduce noise artifacts in the
reconstructions which arise from poor estimates of the high
spatial-frequency components in the modulus. The proper Wiener filter W
requires the squared-modulus $|F|^2$ of the original object and has the
form

$$W(i,j) = \frac{|F_{ij}|^2}{|F_{ij}|^2 + \sigma^2} \qquad (3\text{-}49)$$

where $F_{ij}$ denotes a sample of the Fourier transform F and $\sigma^2$ is the
variance of the estimate. Note that, to a first-order
approximation, $\sigma^2 = 1/(2I_0)$ independent of $|\gamma_{ij}|$. However, $|F|^2$ is
unavailable. As a first pass, we formed a Wiener filter based upon the
spin-average $|\bar{F}|$ of the Fourier modulus:
Figures 3-11(f)-(i), 3-12(f)-(i) and 3-13(f)-(i) show the corresponding
reconstructions from Wiener-filtered modulus estimates for the three
objects. As one would expect, the high-frequency noise artifacts
are greatly diminished in the reconstructions from the Wiener-filtered
data, but some of the resolution has also been
sacrificed.
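
A filter of this kind can be sketched as follows. The sketch assumes the standard Wiener weighting $|F|^2/(|F|^2 + \sigma^2)$ with $|F|$ replaced by a spin (ring) average of the estimated modulus; the exact expression used in this work is not reproduced here, and the ring-averaging scheme, the function names, and the test data are hypothetical.

```python
import numpy as np

def spin_average(modulus):
    """Average a centred 2-D modulus over rings of (rounded) integer radius."""
    n0, n1 = modulus.shape
    y, x = np.indices(modulus.shape)
    r = np.rint(np.hypot(y - n0 // 2, x - n1 // 2)).astype(int)
    ring_sum = np.bincount(r.ravel(), weights=modulus.ravel())
    ring_cnt = np.maximum(np.bincount(r.ravel()), 1)   # guard empty rings
    return (ring_sum / ring_cnt)[r]

def wiener_weights(modulus_estimate, noise_variance):
    """W = F^2 / (F^2 + sigma^2), with F the spin-averaged modulus estimate."""
    f2 = spin_average(modulus_estimate) ** 2
    return f2 / (f2 + noise_variance)

# Toy usage: a smooth, centred modulus with unit-variance additive noise.
rng = np.random.default_rng(0)
yy, xx = np.indices((32, 32))
true_mod = 100.0 * np.exp(-np.hypot(yy - 16, xx - 16) / 4.0)
noisy = true_mod + rng.normal(0.0, 1.0, true_mod.shape)
W = wiener_weights(noisy, 1.0)
filtered = W * noisy
print("weight at DC:", W[16, 16], " weight at corner:", W[0, 0])
```

The weights are close to one where the modulus dominates the noise and fall toward zero at high spatial frequencies, which is the trade between noise suppression and resolution described above.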


A  plot  of  the  absolute  root  mean-squared  error  of  the  various 
reconstructions  as  a  function  of  the  number  of  simulated  photons  per 
detector per frame is shown in Figure 3-14. The Bus/RV object requires
two orders of magnitude more photons than the four points object to
get roughly the same image quality. On the other hand, the Bus/RV
object  has  two  orders  of  magnitude  more  illuminated  resolved  points 
than  the  four  points  object.  Consequently,  image  quality  was  similar 
for  the  same  number  of  photons  per  detector  per  illuminated  resolved 
point  on  the  target. 

3.4  SUMMARY 


Our  proposed  method  for  reconstructing  an  image  from  aberrated  low- 
light  level  aperture-plane  amplitude  interferometer  measurements  is  to 
first  form  an  estimate  of  the  squared  modulus  of  the  mutual  coherence 
and then to reconstruct a diffraction-limited image by using phase
retrieval.


Figure 3-14. Plots of the Absolute RMS Error of the Reconstructed
Images Shown in Figures 3-11 through 3-13 (plot title: "Error in
Reconstruction for Simulated AI Collections").

Two amplitude interferometer systems were analyzed in which a
controllable phase term $\delta(t)$ was introduced in order to allow
measurement of the squared modulus and aberrated phase of samples of
the discretized mutual coherence function: one in which $\delta(t)$ took on
discrete values 0 and $\pi/2$, and the other in which $\delta(t)$ varied linearly
over $[0, 2\pi]$. It was found that squared-modulus estimators for the
discrete-phase  system  perform  better  than  the  estimators  for  the 
continuous-phase  system.  It  was  also  found  that  the  accuracy  of 
squared-modulus  estimates  has  a  strong  dependence  on  the  value  of  the 
squared-modulus,  as  illustrated  in  Figs.  3-4  and  3-5,  and  that  the 
dominant  source  of  error  was  the  standard  deviation  of  the  estimator. 
This  standard  deviation  results  from  the  fact  that  the  estimate  relies 
on the squared difference between two Poisson random variables.
The dependence of the performance on the value of the
squared-modulus of the coherence function also results in the
performance  being  much  better  for  point-like  objects,  for  which  the 
coherence  function  decreases  slowly  with  increasing  spatial  frequency, 
than  for  diffuse,  extended  objects,  for  which  the  coherence  function 
drops  rapidly  with  increasing  spatial  frequencies. 
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4.0  PREDICTION  OF  IMAGE  QUALITY  FOR  FUTURE  EXPERIMENTS 

In this section we describe analysis, simulation, and
reconstruction results that predict the quality of the imagery
that  can  be  expected  to  be  reconstructed  from  future  field  experiments. 
The  scenario  that  was  simulated  was  the  imaging  of  the  first  Firefly 
exercise  (piggybacking  on  the  MIT  Lincoln  Laboratory  laser  radar 
experiment)  launched  from  Wallops  Island  as  would  be  viewed  by  the  MAAI 
attached  to  the  48-inch  telescope  at  Goddard  Space  Flight  Center. 
Light  levels  received  by  the  MAAI,  assuming  sun  illumination  of  the 
target,  were  computed,  the  detected  data  was  simulated,  and  images  were 
reconstructed  from  the  simulated  data.  The  simulation  results  predict 
that  the  images  produced  from  the  MAAI  data  from  the  Goddard  48-inch 
telescope  would  be  of  poor  quality.  A  limiting  factor  was  that  the 
Goddard  48-inch  telescope  has  a  large  central  obscuration,  preventing 
the  measurement  of  the  low-to-mid  spatial  frequencies,  where  most  of 
the  information  resides.  However,  if  the  low  spatial  frequencies  were 
measured,  then  it  was  shown  that  good  quality  imagery  could  be 
reconstructed.  This  could  be  accomplished  by  changes  in  the  MAAI 
(which  will  be  described  later)  or  by  using  a  telescope,  such  as  the 
ISTEF  24-inch,  which  has  a  small  central  obscuration.  Then  for  the 
same  scenario,  images  would  be  reconstructed  with  resolution  far 
exceeding  that  ordinarily  allowed  by  atmospheric  turbulence. 
Furthermore,  if  the  same  experiment  were  performed  in  a  space-borne 
MAAI  at  the  same  range,  then  excellent  results  would  be  obtained,  even 
with  shorter  integration  times. 

In  Section  4.1  we  derive  expressions  for  received  light  levels  for 
the  cases  of  (1)  blackbody  emission  by  the  target,  (2)  sunlight 
reflected  by  the  target  and  (3)  laser  illumination  reflected  by  the 
target.  Then  we  predict  the  reflected  sunlight  levels  that  would  be 
obtained  for  the  Firefly  experiment  in  Section  4.2.  In  Section  4.3,  we 
comment  on  the  undersampling  problem  that  could  occur  in  the 
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experiment.  In  Section  4.4  we  show  digital  simulation  and 
reconstruction  experiments  that  demonstrate  the  image  quality  that 
would  be  obtained  under  various  assumptions. 

4.1  LIGHT  LEVEL  ESTIMATION  -  GENERAL  CASE 

4.1.1  Energy  Scattered  or  Radiated  by  the  Object 

There  are  three  cases  of  interest:  objects  emitting  in  the 
infrared,  objects  scattering  sunlight  in  the  visible  or  infrared,  and 
objects  scattering  laser  illumination  that  is  of  sufficiently  short 
spatial  and/or  temporal  coherence  to  be  effectively  incoherent.  In  the 
first  two  cases,  the  energy  must  be  weighted  by  the  spectral  filter 
which  determines  the  wavelength  band  to  be  detected.  The  total  energy 
is  determined  by  the  detector  integration  time. 

Using  a  blackbody  model  for  infrared  emission,  the  spectral 
radiance $L_\lambda$ (energy emitted per unit time per unit area per unit solid
angle per unit wavelength) of an object is:

$$L_\lambda = \frac{h c^{2}\, \epsilon}{\lambda^{5}\, [\exp(hc/\lambda kT) - 1]} \qquad (4\text{-}1)$$

where h is the Planck constant, c is the speed of light, $\epsilon$ is the
object emissivity, $\lambda$ is the wavelength, k is the Boltzmann constant,
and T is the object temperature. Ideally, an integration is required
over the surface of the object, including the effects of the angle $\theta$
between the local surface normal on the object and the line of sight to
the sensor and of variations in the emissivity and temperature, to
compute total energy.
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For  sunlight  illumination,  the  spectral  radiance  of  an  object  is 
given  by  the  product  of  (1)  the  solar  spectral  irradiance  at  the 
object's  altitude,  (2)  factors  depending  on  the  angles  between  the 
object's  surface  normal  and  (a)  the  solar  illumination  direction  and 
(b)  the  line  of  sight  to  the  sensor,  and  (3)  the  object  reflectivity. 
(Ideally,  an  integration  is  required  over  the  surface  of  the  object.) 
Solar  spectral  irradiance  tables  can  be  found  in  The  Infrared  Handbook, 
Section  3.4  [4.1]. 

For  laser  illumination,  the  energy  scattered  per  unit  solid  angle 
is  the  product  of  the  transmitted  laser  energy,  one  way  transmittance 
losses  (e.g.,  due  to  atmospheric  propagation),  the  ratio  of  the  object 
cross-sectional  area  to  the  laser  beam  area  at  the  object  (including 
the effect of nonuniform beam intensity), the object reflectivity
(again, including nonuniform effects), and the reciprocal of the
scattering solid angle. For rough objects, the scattering solid angle
can approach $4\pi$ steradians. However, for smooth flat objects, the
solid  angle  can  be  so  small  as  to  give  a  glint,  so  some  care  must  be 
taken  in  estimating  this  solid  angle. 

4.1.2  Transmittance  Losses 

Transmittance  losses  could  be  due  to  propagation  through  the 
atmosphere,  transmission  through  the  receiver  optics,  and  use  of  a 
polarizer. 

For  pulsed  laser  illumination,  there  is  an  additional  loss.  The 
amplitude  interferometer  can  collect  data  for  the  entire  object  only 
during  the  time  interval  over  which  light  is  arriving  from  all  parts  of 
the object. For a pulse of length $L_p$ and an object of depth $\Delta R$ (along
the line of sight to the amplitude interferometer), the fraction of the
pulse which may be used (i.e., the pulse utilization efficiency) is
$(L_p - 2\Delta R)/L_p$. This factor is unity for emissive or continuously
illuminated  objects. 
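
As a quick numerical illustration of the pulse utilization efficiency, the sketch below evaluates $(L_p - 2\Delta R)/L_p$ for a 40 m pulse and a 5 m deep object, the values used in the example calculations of Section 4.1.5. The function name and the clamp to zero for pulses shorter than $2\Delta R$ are assumptions made for the sketch.

```python
C = 3.0e8  # speed of light, m/s

def pulse_utilization(pulse_length_m, object_depth_m):
    """Fraction of the pulse during which light arrives from the whole object."""
    if pulse_length_m <= 2.0 * object_depth_m:
        return 0.0  # assumed clamp: the pulse never covers the full depth at once
    return (pulse_length_m - 2.0 * object_depth_m) / pulse_length_m

eta_pulse = pulse_utilization(40.0, 5.0)   # 40 m pulse, 5 m deep object
duration_ns = 40.0 / C * 1.0e9             # pulse length expressed as a duration
print(eta_pulse, duration_ns)
```

For these values the efficiency is 0.75 and the 40 m pulse corresponds to a duration of roughly 130 nsec.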
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4.1.3  Receiver  Collection  Solid  Angle 

For fixed image resolution, the collection solid angle of each
detector pixel, i.e., the solid angle it subtends with respect to the
object plane, is $(d_a/R)^2 = (\lambda/\alpha d_{om})^2$, where $d_a^2$ is the area of a
detector pixel, R is the range to the target, $\lambda$ is the mean wavelength,
$\alpha$ is the desired detector oversampling factor, and $d_{om}$ is the maximum
object diameter. For minimum sampling of amplitude interferometer
data, $\alpha$ = 2.

This result may be derived as follows. For resolution $\Delta d$ at the
object, the receiver aperture must be of diameter $D = \lambda R/\Delta d$. For an
instantaneous field-of-view of diameter (at the object) $\alpha d_{om}$, where $d_{om}$
is the object's diameter, the Nyquist sample spacing at the aperture
plane is $\lambda R/(\alpha d_{om})$. Assuming detector elements of width equal to the
detector spacing, the solid angle of a detector element is therefore
$(\lambda/\alpha d_{om})^2$. There are $D/(\lambda R/\alpha d_{om}) = \alpha d_{om}/\Delta d$ detectors across the
aperture.
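
The chain of relations in this derivation can be checked numerically. The sketch below is illustrative only: the function name and the sample inputs (0.5 μm wavelength, 600 km range, 0.1 m resolution, 2.4 m object) are chosen for the example and are not a design point from this report.

```python
import math

def detector_geometry(wavelength, range_m, resolution, d_om, alpha=2.0):
    """Aperture diameter, sample spacing, pixel solid angle, detectors across."""
    aperture = wavelength * range_m / resolution        # D = lambda R / delta_d
    spacing = wavelength * range_m / (alpha * d_om)     # Nyquist spacing at aperture
    solid_angle = (spacing / range_m) ** 2              # (lambda / (alpha d_om))^2
    n_across = aperture / spacing                       # alpha d_om / delta_d
    return aperture, spacing, solid_angle, n_across

D, spacing, omega, n_across = detector_geometry(0.5e-6, 600.0e3, 0.1, 2.4)
print(f"D = {D:.2f} m, spacing = {spacing:.4f} m, "
      f"solid angle = {omega:.3e} sr, detectors across = {n_across:.0f}")
```

Note that the solid angle depends only on the wavelength, the oversampling factor, and the object diameter, as the text states; the range cancels.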

4.1.4  Parametric  Formulas 

For thermal emission, the energy per detector (i.e., the
product of the factors discussed above) is:

$$\frac{h c^{2}\, \epsilon\, A \cos\theta\; \Delta\lambda\, \Delta t\; \tau_{atm}\, \tau_{opt}\, \tau_{pol}}{\lambda^{3}\, [\exp(hc/\lambda kT) - 1]\, (\alpha d_{om})^{2}} \qquad (4\text{-}2)$$

where

$\epsilon$ is object emissivity
T is object temperature
$d_{om}$ is maximum object diameter
A is object cross-sectional area
$\lambda$ is mean wavelength
$\Delta\lambda$ is wavelength band
$\Delta t$ is detector integration time
$\tau_{atm}$ is atmospheric transmittance
$\tau_{opt}$ is receiver optics transmittance
$\tau_{pol}$ is polarizer transmittance
$\alpha$ is the desired oversampling
$\theta$ is the angle between the object surface normal and the
direction to the sensor
h is the Planck constant
k is the Boltzmann constant
c is the speed of light.


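Note that all integrations over spatial and wavelength variations have
been approximated. For $\alpha$ = 2 (the minimum allowable), $A = \pi (d_{om}/2)^2$,
and $\cos\theta = 1$, the formula becomes:

$$\frac{\pi h c^{2}\, \epsilon\, \Delta t\, \Delta\lambda\; \tau_{atm}\, \tau_{opt}\, \tau_{pol}}{16\, \lambda^{3}\, [\exp(hc/\lambda kT) - 1]} \qquad (4\text{-}3)$$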


It should be noted that for determination of detected signal-to-noise
ratio, the background light level must also be determined and the
detectivity D* of the detector determined.

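For sunlight illumination, the energy per detector is:

$$\frac{E_\lambda\, r_{obj}\, A \cos\theta_i \cos\theta_o\; \Delta\lambda\, \Delta t\; \tau_{atm}\, \tau_{opt}\, \tau_{pol}\, \lambda^{2}}{\pi\, (\alpha d_{om})^{2}} \qquad (4\text{-}4)$$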
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where 

$E_\lambda$ is solar spectral irradiance
$d_{om}$ is maximum object diameter
A is object cross-sectional area
$r_{obj}$ is object reflectivity
$\lambda$ is mean wavelength
$\Delta\lambda$ is wavelength band
$\Delta t$ is detector integration time
$\tau_{atm}$ is atmospheric transmittance
$\tau_{opt}$ is receiver optics transmittance
$\tau_{pol}$ is polarizer transmittance
$\alpha$ is desired oversampling
$\theta_i$ is the angle between the object surface normal and the solar
illumination direction
$\theta_o$ is the angle between the object surface normal and the
direction to the sensor

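and it has been assumed that the object is a Lambertian scatterer. All
integrations over spatial and wavelength variations have been
approximated. For $\alpha$ = 2, $A = \pi (d_{om}/2)^2$, and $\theta_i = \theta_o = 45°$, the formula
becomes

$$\frac{E_\lambda\, r_{obj}\, \tau_{atm}\, \tau_{opt}\, \tau_{pol}\, \lambda^{2}\, \Delta\lambda\, \Delta t}{32} \qquad (4\text{-}5)$$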


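For laser illumination, the energy per detector is:

$$\frac{E\, \tau_{atm}^{2}\, r_{area}\, r_{obj}\, \eta_{pulse}\, \tau_{opt}\, \tau_{pol}\, \lambda^{2}}{\Omega\, (\alpha d_{om})^{2}} \qquad (4\text{-}6)$$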
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where 

E is the transmitted laser energy
$\tau_{atm}$ is the atmospheric transmittance
$r_{area}$ is the ratio of object to beam area
$r_{obj}$ is the object reflectivity
$\Omega$ is the scattering solid angle
$\eta_{pulse}$ is the pulse utilization efficiency, $(L - 2\Delta R)/L$
L is the laser pulse length
$\Delta R$ is the object depth
$d_{om}$ is the object diameter
$\tau_{opt}$ is the receiver optics transmittance
$\tau_{pol}$ is the polarizer transmittance
$\lambda$ is the wavelength
$\alpha$ is the desired oversampling.

Note  again  that  any  integrations  have  been  approximated. 

4.1.5  Example  Calculations 

For thermal emission, the energy per detector is $1.2 \times 10^{-15}$ Joule
or $6 \times 10^{4}$ photons for

$\epsilon$ = 0.5
T = 300 K (sun illuminated)
$d_{om}$ = 5 meters
$\lambda$ = 10 $\mu$m (near blackbody peak)
$\Delta\lambda$ = 0.5 $\mu$m
$\Delta t$ = 1 ms
$\tau_{atm}$ = 1.0
$\tau_{opt}$ = 0.1
$\tau_{pol}$ = 0.5
$\alpha$ = 2
$\theta$ = 0°.
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Note  that 


$hc/\lambda kT = 4.81$,

$\exp(hc/\lambda kT) - 1 = 121$,

$hc/\lambda = 2 \times 10^{-20}$ J.
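
The thermal-emission example can be re-computed with a short script. This is a sketch under stated assumptions: the function and variable names are introduced here, and the radiance prefactor is taken as $hc^2$ because that choice reproduces the quoted $1.2 \times 10^{-15}$ Joule and $6 \times 10^{4}$ photons. The textbook Planck radiance for unpolarized emission carries $2hc^2$, which would double both figures.

```python
import math

H = 6.63e-34   # Planck constant, J s
C = 3.0e8      # speed of light, m/s
K = 1.38e-23   # Boltzmann constant, J/K

def thermal_energy_per_detector(eps, temp, d_om, lam, dlam, dt,
                                tau_atm, tau_opt, tau_pol, alpha, theta_deg):
    """Energy per detector: radiance x area x band x time x losses x solid angle."""
    x = H * C / (lam * K * temp)
    radiance = H * C ** 2 * eps / (lam ** 5 * math.expm1(x))  # h c^2: assumption
    area = math.pi * (d_om / 2.0) ** 2          # circular cross-section assumed
    solid_angle = (lam / (alpha * d_om)) ** 2   # per-detector collection solid angle
    return (radiance * area * math.cos(math.radians(theta_deg)) * dlam * dt
            * tau_atm * tau_opt * tau_pol * solid_angle)

lam = 10.0e-6
x = H * C / (lam * K * 300.0)
energy = thermal_energy_per_detector(eps=0.5, temp=300.0, d_om=5.0, lam=lam,
                                     dlam=0.5e-6, dt=1.0e-3, tau_atm=1.0,
                                     tau_opt=0.1, tau_pol=0.5, alpha=2.0,
                                     theta_deg=0.0)
photons = energy / (H * C / lam)
print(f"hc/(lambda k T) = {x:.2f}, exp(x) - 1 = {math.expm1(x):.0f}")
print(f"energy = {energy:.2e} J, photons = {photons:.2e}")
```

With the rounded constants listed at the end of this section the intermediate quantities come out as about 4.80 and 121, in line with the values noted above.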


For sunlight illumination, the energy per detector is $0.55 \times 10^{-17}$
Joule or 15 photons for

$E_\lambda$ = 1942 W/m$^2$/$\mu$m (exoatmospheric)

^obj  " 

$\Delta t$ = 10 ms

$\lambda$ = 0.5 $\mu$m

$\Delta\lambda$ = 0.03 $\mu$m

=  1.0 


Note  that 


$hc/\lambda = 4 \times 10^{-19}$ J.


For laser illumination, the energy per detector per pulse is
$1.2 \times 10^{-18}$ Joule or 6 photons for

E = 1 Joule/pulse

$\tau_{atm}$ = 1.0

’^area  " 

'"obj  " 

$\Omega = 2\pi$

$\eta_{pulse}$ = 0.75 ($\Delta R$ = 5 m, L = 40 m or 130 nsec)

$\tau_{opt}$ = 0.1

$\tau_{pol}$ = 1.0 (no polarizer)

$\alpha$ = 2

$d_{om}$ = 5 meters

$\lambda$ = 1 $\mu$m.

Note that

$hc/\lambda = 2 \times 10^{-19}$ J.


In  the  above, 


h = $6.63 \times 10^{-34}$ Joule sec

c = $3 \times 10^{8}$ m/sec

hc = $1.99 \times 10^{-25}$ Joule m

k = $1.38 \times 10^{-23}$ Joule/K

hc/k = 0.0144 m K.


4.2  LIGHT  LEVEL  ESTIMATION  -  FIREFLY  EXPERIMENTS 

In  this  section  we  estimate  the  light  level  expected  from  the  first 
Firefly  experiment  when  imaging  the  large  cylindrical  object. 

For sunlight illumination, the number of detected photons
(photo-electrons) per detector per frame is, for a general object,


(4-7a) 




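$$= \eta_q\, E_\lambda\, \Delta\lambda\, \Delta t\, (\lambda/hc)\, r_{obj}\, (A \cos\theta_i \cos\theta_o/\pi)\, \tau_{atm}\, \tau_{opt}\, \tau_{pol}\, (d_a^{2}\, \eta_d/R^{2}) \qquad (4\text{-}7b)$$

$$= \eta_q\, E_\lambda\, \Delta\lambda\, \Delta t\, (\lambda/hc)\, r_{obj}\, [(L d_o/2\pi)\, V(\psi_i + \psi_o)]\, \tau_{atm}\, \tau_{opt}\, \tau_{pol}\, (d_a^{2}\, \eta_d/R^{2}). \qquad (4\text{-}7c)$$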

The parameters in this expression and their estimated values for
the Firefly experiment are listed in Table 4-1. Equation (4-7b) was
obtained from Eq. (4-4) by replacing the oversampling ratio, $\alpha$, by

$$\alpha = \frac{\lambda R}{d_a\, d_{om}} \qquad (4\text{-}8)$$

where $d_a$ is the detector spacing and $\eta_d d_a^2$ is the area per detector
element. Equation (4-7c) is obtained by making the further substitution
of $(L d_o/2\pi)\, V(\psi_i + \psi_o)$ for $(A \cos\theta_i \cos\theta_o)/\pi$ for the cylindrical
target in the Firefly experiment. The object is assumed to be a
cylinder of length L and diameter $d_o$. In Appendix C the theory of a
reflecting cylinder is worked out in detail, and the energy reflected
by the cylinder, assumed to be a Lambertian reflector, is proportional
to


$$V(\psi_i + \psi_o) = \tfrac{1}{2}\,[\sin(\psi_i + \psi_o) - (\psi_i + \psi_o)\cos(\psi_i + \psi_o)] \qquad (4\text{-}9)$$


where $\psi_i$ is the angle of the sun below the horizontal and $\psi_o$ is the
angle of the receiver below the horizontal, as seen from the target.
For the Firefly cylinder, $V(\psi_i + \psi_o)$ is about 0.292, as compared with a
maximum possible value of $\pi/2$ = 1.57 for illumination from the same
angle as the sensor views the object (i.e., for the sun behind the
sensor).
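
The two values quoted for the reflectivity factor can be checked directly from Eq. (4-9). In the sketch below the function name is introduced for the example, and the argument is the sum $\psi_i + \psi_o$ in degrees, with 73° taken from Table 4-1.

```python
import math

def cylinder_reflectivity_factor(psi_sum_deg):
    """V(psi) = (1/2) [sin(psi) - psi cos(psi)], with psi = psi_i + psi_o."""
    psi = math.radians(psi_sum_deg)
    return 0.5 * (math.sin(psi) - psi * math.cos(psi))

v_firefly = cylinder_reflectivity_factor(73.0)    # psi_i + psi_o for Firefly
v_maximum = cylinder_reflectivity_factor(180.0)   # largest value, equal to pi/2
print(f"V(73 deg) = {v_firefly:.3f}, V(180 deg) = {v_maximum:.3f}")
```

The Firefly geometry therefore returns a little under one fifth of the maximum possible factor.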
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Table  4-1 

Parameters for Firefly

Parameter Value     Parameter Symbol            Parameter Name

0.10                $\eta_q$                    detector quantum efficiency
1942 W/m$^2$/$\mu$m      $E_\lambda$                 solar spectral irradiance
0.50 $\mu$m            $\lambda$                   mean wavelength
$4 \times 10^{-19}$ J    $hc/\lambda$                energy per photon
0.03 $\mu$m            $\Delta\lambda$             wavelength band
10 msec             $\Delta t$                  detector frame integration time
2.4 m               $L = d_{om}$                maximum object diameter
varies              $\theta_i$                  the angle between the object surface
                                                normal and the solar illumination
                                                direction
varies              $\theta_o$                  the angle between the object surface
                                                normal and the direction to the sensor
0.8                 $\tau_{atm}$                atmospheric transmittance
0.056               $\tau_{opt}\tau_{pol}$      receiver optics transmittance
600 km              R                           range to target
0.4                 $\eta_d$                    fractional active detector area
(3 cm x 4 cm)       $d_{au} \times d_{av}$      detector element center-to-center spacing
0.0009 m$^2$          $d_a^2$                     area per detector element

For the cylindrical object:

0.4 m               $d_o$                       cylinder diameter
2.4 m               L                           cylinder length
10°                 $\psi_i$                    solar angle below horizon
55° + 8°            $\psi_o$                    sensor angle below object plane
107°                $180° - (\psi_i + \psi_o)$  bistatic angle
73°                 $\psi_i + \psi_o$           180° - (bistatic angle)
0.292               $V(\psi_i + \psi_o)$        reflectivity factor (Appendix C)
For the parameters listed in Table 4-1, $N_{pe} = 0.1$ photo-electrons
per detector (in 10 msec). In one second this would be 10
photo-electrons, and in 150 sec of observing time 1500 photo-electrons
would be detected. As seen in Table 4-2, 150 sec would be available
between times 200 sec and 350 sec from launch, during which period the
target would appear to be relatively stationary as viewed from the
Goddard site. At most, 3200 photo-electrons could be detected during the
320 seconds between times 130 and 450 seconds from launch.

At 3,200 photo-electrons per detector, one can achieve a normalized
root-mean-squared error (NRMSE) of 0.1 (suitable for phase retrieval) for
$|\gamma|$ down to 0.5, and one can achieve an NRMSE of 0.5 (suitable for
parameter estimation from the Fourier modulus) for $|\gamma|$ down to about
0.1.


4.3  SAMPLING  REQUIREMENTS 

For the parameters listed in Table 4-1, $\alpha = 2.08$ if the 3 cm
detector spacing direction is oriented along the long axis of the
cylinder, but $\alpha = 1.56$ if it is oriented in the opposite way. Recall
that $\alpha = 2.0$ is required for Nyquist sampling of $|\gamma|^2$. Since this
opposite orientation was contemplated, serious problems could arise.
For this reason it is worthwhile to review the basis for this sampling
requirement.

For a shear of $\Delta u$, $\gamma(\Delta u)$ requires a sample spacing of

$$\Delta u \le \frac{\lambda R}{L} \qquad (4\text{-}10)$$

in order to avoid aliasing and satisfy the Nyquist criterion, where L
is the length of the target. Recall from Section 2 that the
interferometer measures $|\gamma(2\Delta u, 2\Delta v)|^2$ for a detector at location
Table 4-2

Firefly Launch Parameters as Viewed from Goddard

Time     Range    Elevation    Bistatic angle    Comment
(sec)    (km)     (deg)        (deg)
-----    -----    ---------    --------------    -------
         379      50.3         112.5             Rising fast
200      506      55.6         107.5
350      664      55.4         108.3             Release canister
450      685      50.7         113.4             Dropping fast

($\Delta u$, $\Delta v$). Therefore a doubling of the sampling rate is required because
of the squaring operation (a function squared has twice the bandwidth
of the original function), and another doubling of the sampling rate is
required because of the 180° rotational shear giving the spacing
($2\Delta u$, $2\Delta v$). Therefore the detector spacing must be

$$\delta_a \le \frac{\lambda R}{4L}$$

which is 3.1 cm for R = 600 km, $\lambda = 0.5 \times 10^{-6}$ m and L = 2.4 m.
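
The sampling numbers in this section can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the sampling ratio is assumed here to be (wavelength × range) / (2 × target length × detector spacing), a form chosen because it reproduces the quoted values of 2.08 and 1.56.

```python
def max_detector_spacing(wavelength_m, range_m, target_length_m):
    """Largest detector spacing (m) allowed by spacing <= lambda*R/(4*L)."""
    return wavelength_m * range_m / (4.0 * target_length_m)

def sampling_ratio(wavelength_m, range_m, target_length_m, spacing_m):
    """Assumed sampling ratio lambda*R/(2*L*spacing); 2.0 is the Nyquist value."""
    return wavelength_m * range_m / (2.0 * target_length_m * spacing_m)

wavelength, target_range, length = 0.5e-6, 600e3, 2.4   # values from Table 4-1

d_max = max_detector_spacing(wavelength, target_range, length)
ratio_3cm = sampling_ratio(wavelength, target_range, length, 0.03)
ratio_4cm = sampling_ratio(wavelength, target_range, length, 0.04)
print(f"{100 * d_max:.1f} cm, {ratio_3cm:.2f}, {ratio_4cm:.2f}")
```

With the 4 cm spacing along the cylinder axis the ratio falls below 2.0, which is the undersampled orientation that the text warns about.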

4.4  DIGITAL  SIMULATION  EXPERIMENTS 

A  model  of  the  Firefly  payload  is  shown  in  Figure  4-1.  In  this 
case  we  are  imaging  the  cylindrical  object  (which  later  separates  into 
two parts) 2.4 m long and 0.4 m diameter with a nozzle at one end.
(The  simulated  reentry  vehicle  was  judged  to  be  too  small  and  dim  for 
an  initial  demonstration  of  amplitude  interferometry.)  Because  of  the 
oblique  illumination  angle,  it  would  not  be  realistic  to  use  a 
digitized  version  of  this  photograph  as  the  object  for  our  digital 
experiments.  So  instead,  we  fashioned  a  three-dimensional  shape  from 
wood  and  painted  it  white  with  a  black  stripe.  Shown  in  the  CCD-camera 
image  in  Figure  4-2(a),  it  has  features  that  are  similar  to  those  of 
the  Firefly  object.  Figure  4-2(b)  is  a  photograph  of  the  same  object 
illuminated  from  below  and  behind  at  an  angle  approximating  the  one  at 
which  the  sun  would  be  shining  at  the  Firefly  object.  At  the  nearly 
grazing  angle  involved,  a  weak  glint  on  the  left  half  of  the  object 
appeared  despite  the  fact  that  the  paint  used  (Liquid  Paper  white-out) 
was  not  glossy. 

Figure  4-2(c)  shows  the  image  as  would  be  seen  from  a  diffraction- 
limited  phase-measuring  amplitude  interferometer  (as  though  there  were 
such  a  thing)  of  aperture  diameter  1.2  m  (48  in),  operating  at  a 
wavelength of 0.5 μm at a range of 600 km, with no noise. This image
was obtained by Fourier transforming the object shown in Figure 4-2(b),
multiplying  by  a  circular  aperture  in  the  Fourier  plane  of  the 
appropriate  size,  and  inverse  Fourier  transforming.  From  this  we  see 
that  for  the  large  cylindrical  object  under  this  illumination 
condition,  even  under  the  most  ideal  conditions  the  best  that  could 
ever  be  done  with  a  1.2  m  telescope  is  to  see  a  thin  line  that  curves 
upward  at  one  end  (where  it  is  thinner  at  the  nozzle)  and  has  a  barely 
discernible  dark  band  near  the  other  end.  This  illustrates  the  need 
for  very  large  apertures  for  discrimination. 
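
The three-step procedure just described (Fourier transform, multiply by an aperture mask, inverse transform) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Figure 4-2: the function name, the 64 × 64 test object, and the mask radii (given in Fourier-plane samples rather than meters) are all hypothetical.

```python
import numpy as np

def aperture_filtered_image(obj, outer_radius, inner_radius=0.0):
    """Fourier transform obj, keep frequencies inside an aperture mask, invert.

    Radii are in Fourier-plane samples (illustrative units, not meters).
    """
    n_rows, n_cols = obj.shape
    fy = np.fft.fftfreq(n_rows) * n_rows      # integer frequency indices
    fx = np.fft.fftfreq(n_cols) * n_cols
    rho = np.hypot(fy[:, None], fx[None, :])  # radial frequency of each sample
    mask = (rho >= inner_radius) & (rho <= outer_radius)
    return np.fft.ifft2(np.fft.fft2(obj) * mask).real

# Hypothetical long, thin test object on a 64 x 64 grid.
obj = np.zeros((64, 64))
obj[30:34, 8:56] = 1.0

# Filled aperture, and an annulus with a 2:1 ratio of outer to inner radius.
filled = aperture_filtered_image(obj, outer_radius=16)
annular = aperture_filtered_image(obj, outer_radius=16, inner_radius=8)
```

Because the annular mask removes the zero-frequency sample, the annular image sums to zero and necessarily contains negative values, which is one way to see where ringing artifacts from a central obscuration come from.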

Figure 4-2(d) shows an image that would be obtained from a
diffraction-limited phase-measuring MAAI using a 1.2 m aperture having
a 0.6 m central obscuration, as on the Goddard 48-inch telescope.
Because of the large central obscuration, none of the low-to-mid spatial
frequencies are measured; only the high spatial frequencies are
measured, and the result is a high-pass filtered version of the image
shown in Figure 4-2(b). The same image features are seen, but very
large ringing artifacts dominate the image. The narrow width of the
image can no longer be reliably estimated. Discrimination would be
difficult with this aperture even with ideal imaging with the phase.
To  get  an  image  comparable  to  that  shown  in  Figure  4-2(c),  the  Fourier 
data  would  have  to  be  interpolated  from  the  high  spatial  frequencies 
into  the  mid  and  low  spatial  frequencies. 

Figure  4-2(e)  shows  the  image  that  would  be  obtained  from  a 
diffraction-limited  phase-measuring  MAAI  using  a  0.6  m  (24  inch)  filled 
aperture,  and  Figure  4-2(f)  shows  the  image  that  would  be  obtained  from 
a  diffraction-limited  phase-measuring  MAAI  using  a  0.6  m  aperture  with 
a  0.1  m  central  obscuration,  like  a  telescope  that  is  available  at  the 
Innovative  Science  and  Technology  Experimental  Facility  (ISTEF)  on  Cape 
Canaveral.  The  image  is  lower  in  resolution  by  a  factor  of  two,  as 
expected,  but  the  ringing  artifacts  are  much  less  pronounced  than  for 
the Goddard 48-inch, since the ISTEF 24-inch has a very small central
obscuration. 

If  the  telescope  were  being  operated  in  space,  and  if  the 
aberrations  were  unknown  but  were  slowly  varying  over  the  integration 
time,  then  the  method  using  only  two  frames,  described  in  Section  3, 
could  be  used.  As  discussed  in  Section  4.2,  for  the  first  Firefly 
experiment  with  the  Goddard  48-inch  telescope  and  the  then-current 
implementation  of  the  MAAI,  about  1,500  to  3,200  photons  per  detector 
could  be  obtained  during  the  integration  time.  Data  was  simulated  with 
2,000  photons  per  detector  over  two  frames  for  each  of  the  four 
apertures  described  above.  The  iterative  transform  algorithm  was  used 
to  retrieve  the  phase  over  the  aperture  and,  for  the  annular  apertures, 
simultaneously  interpolate  the  complex  values  into  the  mid  and  low 
spatial  frequencies  where  no  data  would  be  measured.  (Section  5.0  and 
Appendix  D  describe  the  algorithm  in  more  detail.)  The  reconstructed 
images,  shown  in  Figure  4-2(g)-(j),  are  comparable  in  quality  to  the 
diffraction-limited  images  from  the  filled  apertures.  In  fact,  for  the 
48-inch  Goddard  annular  aperture,  the  reconstructed  image  is  actually 
better  than  a  diffraction-limited  image  with  a  phase-measuring  MAAI 
[compare  Figure  4-2(h)  with  4-2(d)].  This  results  from  the  success  of 
the  interpolation  of  the  mid  and  low  spatial  frequencies  that  would 
otherwise be lost. This is a remarkable success for the phase
retrieval/interpolation algorithm operating on MAAI data.

We  also  performed  experiments  with  lower  numbers  of  photons, 
corresponding  to  proportionally  shorter  integration  times.  For  only 
400  total  photons  per  detector  over  the  two  frames,  which  is  1/5  the 
light  level  expected  for  the  Firefly  experiment,  the  major  features  of 
the  object  are  still  seen  in  the  reconstructed  image,  although  the 
image  is  noticeably  noisier  than  the  one  for  2,000  photons  per 
detector. 
For an earth-bound telescope, atmospheric turbulence limits the
integration time for a single frame to about 10 msec. Therefore during
a 200 sec total integration time, one must collect 20,000 frames of
data of exposure time 10 msec each. We simulated 4,000 total photons
per detector over the 20,000 frames. Note that this is equivalent to
an average of 1/5 photon per detector per frame. That is, most
detectors would receive zero photons in a given frame. This data is
extremely noisy, to say the least. By summing over 20,000 frames the
signal-to-noise ratio is built up. The images reconstructed from this
simulated data for the Goddard 48-inch and ISTEF 24-inch annular
apertures are shown in Figure 4-3(e) and (h), respectively. For the
Goddard 48-inch aperture, large amounts of noise fill the support
constraint used during the iterations. A hint of the long, thin object
is seen in the image, but the high level of noise would cause one to
have little confidence in it. This illustrates the fact that, even if
a large number of photons are collected, if they are spread over too
many frames, they are not as effective as the same number of photons
spread over a small number of frames. The interpolation, which worked
well for the case of 2 frames for a space-based sensor, works poorly
here since the coherence function squared-modulus estimate is so much
noisier. As shown in Figure 4-3(h), the image reconstructed from the
same number of photons per detector and the same number of frames, but
for the ISTEF 24-inch aperture, is much less noisy and clearly shows
the major features of the object, although at only half the resolution.
This greatly improved result is due to the fact that the much-smaller
central obscuration requires far less interpolation. The interpolation
task is then much easier, and the image quality is limited only by the
aperture size and the performance of the phase retrieval algorithm.
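
To make the statement that most detectors receive zero photons in a given frame concrete, the sketch below computes the per-frame count probabilities. It assumes Poisson photon statistics, which is an assumption made here for illustration; the variable names are hypothetical.

```python
import math

total_photons = 4000   # per detector, summed over all frames
n_frames = 20000       # 200 sec of data at 10 msec per frame

mean_per_frame = total_photons / n_frames   # average photons per detector per frame
p_zero = math.exp(-mean_per_frame)          # Poisson probability of 0 photons
p_one = mean_per_frame * p_zero             # Poisson probability of exactly 1 photon
p_two_or_more = 1.0 - p_zero - p_one        # everything else

print(f"mean={mean_per_frame:.2f}  P(0)={p_zero:.3f}  "
      f"P(1)={p_one:.3f}  P(>=2)={p_two_or_more:.3f}")
```

Under this assumption roughly four detectors in five record nothing in any single 10 msec frame, and fewer than 2% record more than one photon.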

Since the atmospheric "seeing" can be expected to have a
correlation distance of about 0.05 meters under these circumstances,
the ISTEF 24-inch (0.6 m) image shown in Figure 4-3(h) has resolution
[Figure 4-3 panels: (a) Diffuse Illumination; (b) Sunlit. Simulated
Collection Through Goddard Aperture (20K frames, 4K photons total):
(c) Diff. Lim. Image, (d) Fourier Mag., (e) Reconstructed Image.
Simulated Collection Through ISTEF Aperture (20K frames, 4K photons
total): (f) Diff. Lim. Image, (g) Fourier Mag., (h) Reconstructed Image.]

Figure 4-3. Object and Reconstructed Images for Simulation of
            Ground-Based Imaging through Atmospheric Turbulence with the
            Amplitude Interferometer. (a) Model diffusely
            illuminated; (b) model illuminated by spotlight; for the
            Goddard 48-inch aperture: (c) diffraction-limited image,
            (d) Fourier modulus, (e) reconstructed image; for the
            ISTEF 24-inch aperture: (f) diffraction-limited image,
            (g) Fourier modulus, (h) reconstructed image.

about (0.6 m/0.05 m) = 12 times better than what would be seen with a
diffraction-limited telescope viewing the same object through the same
turbulent atmosphere. In fact, the blur circle for atmospheric-limited
imaging in this case would be $\lambda R/r_0 = 6$ m, which is 2.5 times wider
than  the  length  of  the  target.  Therefore  an  image  of  this  target  from 
a  conventional  diffraction-limited  telescope  would  be  a  large  blob 
showing  no  detail  whatsoever,  whereas  the  image  from  the  MAAI  operating 
with  the  ISTEF  24-inch  would  show  recognizable  features  of  the  object. 
This  demonstrates  the  tremendous  advantage  of  using  the  MAAI  under  the 
right  circumstances. 
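
The resolution figures in this comparison follow from simple ratios, worked through in the sketch below. The variable names are hypothetical; the numerical values are those quoted in the text and in Table 4-1.

```python
wavelength = 0.5e-6     # m
target_range = 600e3    # m
r0 = 0.05               # m, atmospheric correlation distance
aperture = 0.6          # m, ISTEF 24-inch
target_length = 2.4     # m

seeing_blur = wavelength * target_range / r0             # atmosphere-limited blur at the target
diffraction_blur = wavelength * target_range / aperture  # aperture-limited resolution at the target
gain = seeing_blur / diffraction_blur                    # same as aperture / r0
blur_to_target = seeing_blur / target_length

print(f"{seeing_blur:.1f} m, {diffraction_blur:.2f} m, "
      f"gain {gain:.0f}, blur/target {blur_to_target:.1f}")
```

The seeing-limited blur of 6 m exceeds the 2.4 m target by a factor of 2.5, whereas the 0.5 m aperture-limited resolution places several resolution elements along the cylinder.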

Figures 4-3(d) and 4-3(g) show the MAAI data (squared-modulus of
the coherence function) simulated over the 48-inch Goddard aperture and
the 24-inch ISTEF aperture for the ground-based case. The vertical
streak  down  their  centers  is  due  to  the  fact  that  the  target  is  long 
and  thin  in  the  opposite  direction.  The  holes  in  the  centers  are  due 
to  the  central  obscurations  of  the  telescopes.  Note  that  in  the 
horizontal  dimension,  in  which  the  target  is  resolved,  the  signal-to- 
noise  ratio  rapidly  drops  away  from  the  center.  This  helps  to  explain 
why  the  Goddard  aperture  worked  so  poorly.  The  central  obscuration  of 
the  Goddard  48-inch  is  about  the  same  size  as  the  entire  24-inch  ISTEF 
aperture.  That  is,  the  annulus  of  data  gathered  by  the  Goddard  48-inch 
would  only  start  beyond  the  outer  diameter  of  the  ISTEF  24-inch.  Since 
at this point the data has become quite noisy, we see that the Goddard
48-inch would miss the data where the signal-to-noise ratio is good and
measure it where the signal-to-noise ratio is primarily poor. For this
reason it is important to change the way that the MAAI measures data
with telescopes like the Goddard 48-inch: modifications are necessary
to measure the low spatial frequencies, even if it means missing some
of the highest spatial frequencies. This is described in Section 6.
5.0 IMAGE RECONSTRUCTION WITH A PARTIALLY-FILLED APERTURE

For the case of a partially-filled aperture, including central
obscurations or multiple-mirror telescopes, portions of the spatial
frequency domain are not measured. One way to get around this problem is
to  change  the  way  that  the  aperture  is  sheared  by  the  interferometer  so 
that  it  measures  the  lower  spatial  frequencies.  When  this  is  done  the 
highest  spatial  frequencies  are  lost,  but  the  net  image  quality  can  be  far 
higher  than  what  would  be  obtained  with  the  traditional  method  of  shearing 
the  wavefront.  This  alternative  shearing  approach  is  described  in 
Section 6. If the alternative shearing approach is not taken, then the
reconstruction algorithm must simultaneously interpolate the phase and
modulus values where they are missing while retrieving the phase where
the modulus is measured. This is necessary because the impulse response
of  a  partially-filled  aperture  usually  has  large  sidelobes  that  go  both 
positive  and  negative,  which  interferes  with  both  the  support  constraint 
and  the  nonnegativity  constraint  used  by  the  phase  retrieval  algorithm. 
This  is  a  particularly  difficult  task  if  the  lower  spatial  frequencies  are 
missing  because  of  a  central  obscuration  of  the  telescope,  since  the 
visibility  modulus  at  lower  spatial  frequencies  is  typically  much  larger 
than  at  the  higher  spatial  frequencies.  How  we  accomplished  this  and  the 
results  are  briefly  summarized  below.  A  detailed  description  is  given  in 
Appendix  D. 

The  method  of  simultaneous  phase  retrieval  and  interpolation  is  a 
modification  of  the  standard  iterative  transform  algorithm.  One  iteration 
consists  of  the  usual  four  steps,  but  with  the  following  change  in  the 
second  step  in  the  Fourier  domain:  where  the  Fourier  modulus  is  measured, 
the computed Fourier modulus is replaced by the measured modulus; where
the  Fourier  modulus  is  not  measured  but  is  within  the  area  that  would  have 
been  occupied  by  a  filled  aperture  of  the  same  diameter,  the  Fourier 
modulus  is  unchanged;  and  beyond  the  area  that  would  have  been  occupied  by 
the  filled  aperture,  the  Fourier  modulus  is  set  to  zero.  If  any  phase 
information  has  been  measured  in  any  region,  then  in  that  region  the 
computed  phase  is  replaced  by  the  measured  phase. 
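
The modified Fourier-domain step described above can be sketched as follows. This is a minimal illustration that assumes the error-reduction form of the iterative transform algorithm with support and nonnegativity constraints; the algorithm actually used is described in Appendix D. The function names, mask radii, and test object are hypothetical, and the case in which some phase has been measured is omitted.

```python
import numpy as np

def fourier_step(G, measured_modulus, measured_mask, filled_mask):
    """Fourier-domain step with the three-region rule described above."""
    phase = np.exp(1j * np.angle(G))
    # Measured region: impose the measured modulus, keep the computed phase.
    # Unmeasured region inside the filled aperture: leave G unchanged.
    out = np.where(measured_mask, measured_modulus * phase, G)
    # Beyond the filled aperture: set the value to zero.
    return np.where(filled_mask, out, 0.0)

def error_reduction(measured_modulus, measured_mask, filled_mask, support,
                    n_iter=50, seed=0):
    """Error-reduction iterations with support and nonnegativity constraints."""
    rng = np.random.default_rng(seed)
    g = rng.random(support.shape) * support            # random start inside the support
    for _ in range(n_iter):
        G = np.fft.fft2(g)                             # 1: transform the estimate
        G = fourier_step(G, measured_modulus,
                         measured_mask, filled_mask)   # 2: Fourier-domain constraints
        g_new = np.fft.ifft2(G).real                   # 3: inverse transform
        g = np.where(support & (g_new >= 0), g_new, 0.0)  # 4: object-domain constraints
    return g

# Hypothetical example: thin bar object, annular set of measured frequencies.
n = 32
obj = np.zeros((n, n))
obj[14:18, 6:26] = 1.0
f = np.fft.fftfreq(n) * n
rho = np.hypot(f[:, None], f[None, :])
filled_mask = rho <= 12                    # extent of the filled aperture
measured_mask = filled_mask & (rho >= 3)   # central obscuration removes rho < 3
measured_modulus = np.abs(np.fft.fft2(obj))
support = np.zeros((n, n), dtype=bool)
support[12:20, 4:28] = True

estimate = error_reduction(measured_modulus, measured_mask, filled_mask, support)
```

Leaving the unmeasured samples free inside the filled aperture is what lets the object-domain constraints fill them in over successive iterations. No claim is made here about how well this small example converges.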
We found that for filled apertures with no phase information, the
iterative  transform  algorithm  usually  converges  reasonably  quickly  to  the 
correct  solution.  For  a  partially  filled  aperture  with  no  phase 
information,  for  which  both  phase  retrieval  and  interpolation  are 
required,  successful  reconstructions  were  obtained,  but  only  when  the 
central  obscuration  was  small.  This  was  for  the  case  of  a  very  extended 
object.  As  was  seen  in  Section  4,  for  a  simpler  object,  reconstructions 
of  this  type  are  also  possible  with  a  larger  central  obscuration  if  the 
signal-to-noise ratio (light level) is very high.

We  also  experimented  with  interpolation  when  the  phase  is  measured. 
Problems  with  nonunique  solutions  were  encountered  if  the  missing  region 
was  large.  Therefore  the  difficulty  with  combined  phase  retrieval  and 
interpolation  may  be  limited  more  by  the  interpolation  than  by  the  phase 
retrieval  in  some  circumstances. 
6.0 ALTERNATIVE AMPLITUDE INTERFEROMETER FOR GROUND-BASED EXPERIMENTS

In  order  to  avoid  the  problems  with  the  reconstruction  algorithms 
that  occur  when  the  telescope  has  a  central  obscuration,  the  way  that  the 
aperture  is  sheared  by  the  interferometer  can  be  changed  so  that  it 
measures  the  lower  spatial  frequencies.  When  this  is  done  the  highest 
spatial  frequencies  are  lost,  but  the  net  image  quality  can  be  far  higher 
than  what  would  be  obtained  with  the  traditional  method  of  shearing  the 
wavefront.  This  is  important  for  ground  based  experiments  using  existing 
telescopes,  although  it  would  probably  not  be  a  problem  for  an  eventual 
space-based  system  for  which  a  second  small  telescope  could  fill  the  need 
for  the  low  spatial  frequencies. 

The usual geometry for the 180° rotational shear and the detectors
is shown in Figure 6-1(A). Only the right half of each of Figures 6-1(A),
(B) and (C) gets through one side of the Koster's prism. The annular
aperture is rotated 180° about its center and interfered, so that, for a
symmetric aperture, the sheared and combined fields occupy the same area
as the original aperture. The detector array (shown shaded), on one of the
two sides of the Koster's prism, covers only half of the aperture, but that
is all that is needed since the coherence function is symmetric about the
origin. The low to mid spatial frequencies surrounding the origin in
spatial frequency space, indicated by a dot in the figure, are all
missing. The low to mid spatial frequencies are measured by either of the
alternative geometries shown in Figure 6-1(B) and (C). In these cases the
fields are translated horizontally (B) or vertically (C) prior to rotation
by 180° so they are rotated about points other than the center of the
aperture. For the cases shown in Figure 6-1(B) and (C), the rotations are
about points halfway between the inner and outer radii of the annulus.
That point is the location of the origin of spatial frequency space, and
all the low to mid spatial frequencies around it are measured. This can
be accomplished simply by shifting the optical axis of the interferometer,
making it offset with respect to the optical axis of the telescope. For
a ratio of radii of 2:1, for the geometry of Figure 6-1(B) in the
horizontal direction the highest spatial frequency passed is reduced to
Figure 6-1. Alternative Pupil Shearing and Detection Geometries for
            Annular Apertures. The shaded rectangles are potential
            areas for the detector array to cover. (A) Conventional
            geometry; (B) alternative geometry with horizontal offset;
            (C) alternative geometry with vertical offset.

1/4 that of the usual geometry, and in the vertical direction the highest
spatial frequency passed is $\sqrt{7/16} = 0.66$ that of the usual geometry.
Since the width of the overlap region is narrow near the highest spatial
frequencies, the highest practical spatial frequency is about 1/2 that of
the usual geometry. This dimension should be oriented along the dimension
for which resolution is most important.
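
The factors of 1/4 and 0.66 can be recovered from the overlap of the outer pupil edge with its rotated copy. The sketch below is one geometric reading of Figure 6-1(B), offered as an assumption rather than the derivation used in the report: the pupil is rotated 180° about a point midway between the inner and outer radii, and the results are normalized to the outer radius, which is the limit for rotation about the center. The function name is hypothetical.

```python
import math

def offset_shear_limits(outer_radius, inner_radius):
    """Overlap extent of the outer pupil edge with its 180-degree-rotated copy.

    The rotation point lies midway between the inner and outer radii.
    Returns (along_offset, across_offset) as fractions of the outer radius.
    """
    pivot = 0.5 * (outer_radius + inner_radius)
    # Along the offset direction the overlap ends at the outer edge of the pupil.
    along_offset = outer_radius - pivot
    # Across it, the two outer circles intersect on the line through the pivot.
    across_offset = math.sqrt(outer_radius**2 - pivot**2)
    return along_offset / outer_radius, across_offset / outer_radius

along, across = offset_shear_limits(outer_radius=1.0, inner_radius=0.5)
print(f"{along:.2f} {across:.2f}")
```

For the 2:1 ratio of radii this gives 0.25 along the offset direction and 0.66 across it.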

For the geometry of Figure 6-1(C), the highest spatial frequency
passed  is  0.66  that  of  the  usual  geometry  in  the  horizontal  direction  and 
1/4  in  the  vertical  dimension  compared  with  the  usual  geometry,  and  the 
detector  array  is  closer  to  a  square  shape. 

In  these  cases,  for  the  same  number  of  detector  elements,  the 
alternative  geometries  have  twice  the  field-of-view  in  each  dimension  as 
the  usual  geometry,  and  the  fraction  of  detector  elements  that  are  used  is 
increased  from  58.9%  to  82.6%.  Most  importantly,  the  low  and  mid  spatial 
frequencies, where $|\gamma|$ is large, are measured, enabling image
reconstruction  at  much  lower  light  levels. 

Another  potential  operating  mode  would  be  to  have  a  system  which 
flips  between  the  two  geometries,  which  could  be  accomplished  with,  say, 
a  movable  mirror.  Then  alternately  both  the  low  spatial  frequencies  and 
the  highest  spatial  frequencies  could  be  measured. 
7.0 IMAGE RECONSTRUCTION USING A DECONVOLUTION ALGORITHM

An  alternative  to  the  iterative  transform  phase  retrieval  algorithm 
(which  was  the  workhorse  algorithm  for  most  of  this  effort)  was  developed. 
It  is  a  version  of  the  Ayers-Dainty  blind  deconvolution  algorithm  modified 
to  solve  the  phase  retrieval  problem,  using  support  and  nonnegativity 
constraints. 

In the blind deconvolution problem, one is given an image g(x) which
is the convolution of two arrays, f(x) and h(x), neither of which is
known, and both of which we wish to reconstruct from g(x). The Fourier
transform, G(u), of g(x) is given by the product of the Fourier transforms
of f(x) and h(x):

$$G(u) = F(u)\,H(u). \qquad (7\text{-}1)$$

Phase retrieval is a special case of blind deconvolution for which g(x) is
the autocorrelation of the object (given by the inverse Fourier transform
of the squared Fourier modulus), f(x) is the unknown object, and h(x) is
the twin image (Hermitian conjugate) of f(x), and we are given
$|F(u)|^2 = F(u) F^*(u)$. The Ayers-Dainty algorithm iteratively estimates
F, f, H, and h by inverting Eq. (7-1) and using constraints, such as
support and nonnegativity, on f and h.

Our analysis showed that the algorithm, modified for the phase
retrieval problem, has properties similar to the error-reduction version
of the iterative transform algorithm. It converges slowly but seems to
handle noise well, perhaps due to a built-in Wiener filter that we use to
invert Eq. (7-1).
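
One common way to invert a product such as Eq. (7-1) without dividing by small values is a Wiener-type regularized division. The sketch below illustrates that idea only. The function name and the constant `noise_power` are assumptions; the filter actually used is described in Appendix E.

```python
import numpy as np

def wiener_divide(G, F_est, noise_power):
    """Estimate H from G = F H using a Wiener-type regularized inverse of F_est."""
    return G * np.conj(F_est) / (np.abs(F_est) ** 2 + noise_power)

# Hypothetical check for the phase retrieval case, where G is the squared modulus.
rng = np.random.default_rng(0)
f = rng.random((8, 8))                       # nonnegative test object
F = np.fft.fft2(f)
G = np.abs(F) ** 2                           # the given data, F times its conjugate
H = wiener_divide(G, F, noise_power=1e-12)   # should be close to the conjugate of F
h = np.fft.ifft2(H).real                     # estimate of the twin image

# The twin of a real object on a periodic grid is the object reflected about the origin.
twin = np.roll(f[::-1, ::-1], 1, axis=(0, 1))
```

When the estimate of F is exact and the regularization is negligible, H reduces to the complex conjugate of F and h to the twin image. A larger `noise_power` suppresses frequencies where the estimated modulus is small.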

A detailed description of the new phase retrieval algorithm and some
results  of  computer  simulations  and  reconstructions  are  given  in 
Appendix  E. 
8.0 NUMERICAL INVESTIGATION OF PHASE RETRIEVAL UNIQUENESS

A  question  that  always  arises  for  image  reconstruction  by  phase 
retrieval  is  whether  the  image  obtained  is  unique.  If  it  were  likely  that 
other  images  were  also  consistent  with  the  data  and  constraints,  then  the 
method  would  not  be  reliable.  A  new  methodology  of  quantifying  the 
uniqueness  of  the  solution  was  developed  and  exercised.  The  subspace  of 
all ambiguous solutions was analytically derived for the case of small (up
to 3 × 3 pixels) images. If an image is a distance from this subspace
less than the measurement noise of the Fourier modulus data, then it is
consistent with an ambiguous image. If the ambiguous counterpart to the
ambiguous image is very different from the original object, then the
solution is ambiguous in a practical sense. For 2 × 2 and 3 × 2 images,
Monte  Carlo  experiments  were  conducted  to  determine  the  probability  that 
a  random  image  would  lie  within  a  certain  distance  of  this  subspace.  It 
involved  a  reduced-gradient  search  along  the  subspace  of  ambiguous  images 
to  determine  the  ambiguous  image  closest  to  a  given  image.  It  was  found 
that  for  small  amounts  of  noise,  the  probability  of  having  an  ambiguous 
image  is  small.  As  the  noise  level  increases,  the  probability  of  having 
a  practical  ambiguity  increases. 

The  surface  of  ambiguous  images  for  the  3x2  case  is 
five-dimensional,  embedded  in  a  six-dimensional  space.  On  the  other  hand, 
for  the  3x3  case,  the  ambiguous  images  lie  in  a  seven-dimensional 
surface  embedded  in  a  nine-dimensional  space.  Since  the  ambiguous  images 
in  the  latter  case  have  dimension  two  less  than  the  space,  it  seems  that 
they  would  be  far  less  likely  to  occur.  Therefore,  for  larger  images  of 
practical  interest,  the  probability  of  ambiguity  is  probably  less  than 
what  we  computed  for  the  3  x  2  images. 

We  also  explored  the  relationship  between  ambiguous  solutions  and 
local  minima  encountered  by  phase  retrieval  algorithms. 

The  most  important  aspect  of  this  task  was  the  development  of  a 
methodology  for  determining  the  probability  of  uniqueness  in  a  practical 
sense. If successfully extended to the larger images of interest, it
could  yield  a  practical  estimate  of  probability  of  ambiguity,  and  of  the 
reliability  of  phase  retrieval. 

A  detailed  description  of  this  study  of  the  uniqueness  of  phase 
retrieval  is  given  in  Appendix  F. 
9.0 ASSESSMENT OF COMPUTATIONAL REQUIREMENTS

The  computational  requirements  for  phase  retrieval  were  analyzed. 
Versions  of  the  algorithm  were  also  sent  to  other  researchers  to  implement 
on  particular  computer  architectures,  such  as  the  Carnegie-Mellon  Warp. 

Each iteration of the iterative transform phase-retrieval algorithm
involves two 2-D fast Fourier transforms (FFT's) and some additional
operations in the two domains. These additional operations include
addition, subtraction, multiplication, division, and square root. For
some versions of the algorithm it is also necessary to compute sin, cos,
arctangent (i.e., conversion between real-imaginary and modulus-phase),
logical NOT, and clipping (≥ 0). All of these operations are done
independently on 2-D arrays of numbers, so that they are well-suited to
vector processor or parallel computing architectures. The 2-D FFT's are
similarly well-suited to vector or parallel architectures, since the row
(or column) 1-D FFT's can be done in parallel. If fully optimized, the
largest computational burden will ordinarily be the FFT's. Since
typically dozens to hundreds of iterations are required for convergence,
depending on the difficulty of the particular reconstruction problem, the
primary computational burden is dozens to hundreds of FFT's to compute a
single image. For the SDI discrimination application, all this must be
done in a short time, say 20 msec. Consider this example: if the FFT
array size were N × N = 64 × 64, considering that each 2-D FFT requires
about $2N^2 \log_2 N$ complex floating-point operations (CFLOP's), the
computational rate required to perform 100 iterations (200 FFT's) in 20
msec would be about 500 MegaCFLOP/sec. Consequently, substantial
parallelism in the computing architecture is currently necessary to
perform these algorithms in the short times allowed. This could be done
currently with a Cray Y-MP supercomputer. Efforts are underway to put
this level of computing power in a small package (a size less than that of
a five-pound coffee can).
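
The 500 MegaCFLOP/sec estimate can be reproduced directly from the operation count quoted above. The sketch below is illustrative: the function name and parameter names are hypothetical, and only the FFT's are counted, as in the text.

```python
import math

def required_cflops_per_sec(n, iterations, time_budget_s, ffts_per_iteration=2):
    """Complex floating-point rate needed if only the 2-D FFT's are counted."""
    cflops_per_fft = 2 * n * n * math.log2(n)   # about 2 N^2 log2(N) per 2-D FFT
    total = cflops_per_fft * ffts_per_iteration * iterations
    return total / time_budget_s

rate = required_cflops_per_sec(n=64, iterations=100, time_budget_s=20e-3)
print(f"{rate / 1e6:.0f} MegaCFLOP/sec")
```

For N = 64 each FFT costs 49,152 CFLOP's, so 200 FFT's in 20 msec require about 490 MegaCFLOP/sec, consistent with the rounded figure of 500.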
Our own computing hardware experience has seen a substantial speed-up
over time with machines of comparable cost. A single iteration for
array  size  N  x  N  =  128  x  128  took  1.00  seconds  with  a  Floating  Point 
Systems  AP120B  array  processor  in  1980,  0.60  seconds  with  a  Mars  Numerix 
432  array  processor  in  1986,  and  0.15  seconds  with  the  Carnegie-Mellon 
Warp  computer  (tests  performed  by  H.T.  Kung's  group)  in  1988.  The  latter 
time  was  dominated  by  (a)  a  non-optimized,  slow  square  root  function  and 
(b)  a  corner-turn  required  to  be  performed  in  the  host  computer  rather 
than  interior  to  the  Warp.  With  an  optimized  square  root  function  and  a 
larger  memory  within  the  Warp,  this  time  could  be  reduced  by  a  factor  of 
two. 


In  a  realistic  space-based  scenario,  special-purpose  electronic 
processors  would  be  used  instead  of  the  general-purpose  processors 
described above. Speed-ups of special-purpose electronic
processors over general-purpose processors are typically in the range of
100 times to 1,000 times. Projected general-purpose processors should be
adequate  for  the  job.  Therefore,  if  special-purpose  electronic  processors 
were  developed,  then  the  computational  requirements  for  phase  retrieval 
would  be  easily  achieved. 

10.0 LABORATORY EXPERIMENTS

It was intended that images be reconstructed from MAAI data
gathered in the laboratory. The data were to be collected by the
University  of  Maryland  (UMd)  in  their  laboratories.  ERIM  prepared  test 
targets  of  appropriate  objects  for  use  in  the  experiments  and  delivered 
them  to  UMd.  The  targets  were  those  digitized  images  shown  in  Section 
2.2.  They  were  written  onto  fine-grained  film  using  an  Eikonix  laser- 
beam  recording  system.  Transparencies,  as  opposed  to  reflective 
objects,  were  used  in  order  to  maximize  the  intensity  of  the  light  that 
would  enter  the  MAAI.  Transparencies  were  produced  at  a  variety  of 
magnifications  in  order  to  match  the  size  requirements  of  the 
experimental  setup.  Special  care  was  taken  to  make  the  background 
density  of  the  transparencies  as  dark  as  possible  to  avoid  a  background 
term.  No  MAAI  data  was  gathered  in  the  laboratory  during  this  effort. 

Phase retrieval/image reconstruction software that resided on a
Heurikon-hosted  Mercury  Zip  Array  Processor  at  ERIM  was  delivered  to 
UMd  and  extensive  assistance  was  given  to  UMd  by  ERIM  to  get  the 
software  to  work  on  the  Micro-Vax-hosted  Zip  at  UMd.  Considerable 
effort  was  required  to  overcome  operating  system  incompatibilities 
(Unix  vs.  VMS).  This  transition  was  made  to  enable  UMd  to  perform 
image  reconstruction  both  at  UMd  locally  and  at  remote  test  range 
sites. 

APPENDIX A

EXPRESSIONS FOR BIAS AND MEAN-SQUARED-ERROR


In this Appendix, approximate algebraic expressions are derived for the bias and
mean-squared error (MSE) of several of the estimators for |γ_ij|^2 discussed in
Section 3. These bias and MSE expressions can then be used to compute the normalized
bias (NB) and normalized root mean-squared error (NRMSE) given by Equations (3-30).
To aid in the computation of expressions for the bias and MSE, the symbolic
computation software Maple [A-1] was used. Section A.1 contains expressions for the
bias and MSE for four estimators: D1, Equation (3-17); D2, Equation (3-18); C1,
Equation (3-26); and C2, Equation (3-28). Listings of the Maple sources used to
generate the resulting expressions are given in Section A.2.


A.1 ALGEBRAIC EXPRESSIONS

The following methodology was used in computing expressions for the bias and
MSE associated with estimators D1 and C1. Note that estimator D1 consists of
a sum of terms which involve the ratio of the photon difference N^d_jk - N^s_jk to
the photon sum N^d_jk + N^s_jk as in (3-17). Similar ratios are required for
estimator C1. Consequently, direct expressions for bias and mean-squared error
associated with D1 and C1 are difficult to compute. Instead, to compute the bias
and MSE of these two estimators, we use asymptotic expansions for terms involving
(N^d_jk + N^s_jk)^(-1).

The resulting expressions for bias and MSE can then be expressed in terms of a
power series in I_0^(-1). Approximate expressions for bias and MSE are then
calculated by truncating the respective series representations after the first few
terms. In the expressions below, the first four terms (zeroth, first, second and
third-order) are maintained and the resulting accuracy of the expressions for bias
and MSE is therefore of order I_0^(-4). Maple was used as an aid in performing the
required symbolic computations. In the following expressions, the subscripts denote
the estimator, c is defined in (3-13), and the term Q is related to the number of
frames as:


Q  =  sin(arg  7,j  + 

k=\ 


(A-1) 


ESTIMATOR D1:


13.375/0%, ^  i/o^ 

- r"  +  "A - - ^3 - +  -^  (A-2) 


MSEdi 


4/0-' C7.,^  8/o-%./Q 


,  6/o-^,/Q  3/o-^c^ 

I<  K  A'2 

516.5/o-%./Q  6.5/o-S„2  _  3.5/o-"c 


(A-3) 


ESTIMATOR C1:


Bi>3„  - 


,_2  0.6918643184/o-%i/ 

+/o  ^ 

45.87288805/o-%,/  Jg^ 

c3  c 


(A-4) 


MSEci  ^  20.51851925/o-^  C7./  -  22 -64794786/0-^  7./ 

-11.06982909/o-%./  +  8c"/o-'  + 

+8.5/-3c  -  ^5.50759459/0-%,,^  __  901.2612945/o-%„^ 

c  cP  ^  ^ 

An alternative methodology was used in the computation of expressions for the
bias and MSE associated with estimators D2 and C2. Expressions for the bias and
mean-squared error for estimators D2 and C2 are simplified by ignoring the variance
and higher-order moments of the denominator term involving Î_0. As can be seen from
Equation (3-6), the standard deviation of Î_0 is inversely proportional to the
square root of the product of the number of pixels, the number of frames K, and the
integration time Δt. Furthermore, since (N^d_jk + N^s_jk) is a Poisson random
variable, its standard deviation is the square root of its expectation. Combining
these relationships, the normalized standard deviation of Î_0 is:


Std.  Dev. 


1 

A’vTT^' 


(A-6) 


Therefore, for sufficiently large I_0, number of pixels, and K, we can ignore fluctuations in Î_0 in
the  computation  of  the  bias  and  mean-squared  error  of  estimators  D2  and  C2.  As 
a  result,  formulas  for  the  first  and  second  moments  of  estimators  D2  and  C2  are 
straightforward  to  compute.  Again,  Maple  was  used  as  an  aid  in  the  computation. 


Estimator  D2: 


Bias£)2  = 


(A-7) 


MSEd2  — 


4/o"'  lij^c 


+ 


3/o 


-■‘0  It]  2 

K 


A'2 


2^0 


'.'2 

Estimator C2:


Biasc2  =  24  '  c 


(A-9) 


MSEc2  20.518519257i,4/o-’ 




+8/o"V  +  5.129C298137o  S-/ 

+\lo"c  (A-10) 


References: 

[A-1] Char, B. W., Geddes, K. O., Gonnet, G. H., Monagan, M. B., and Watt, S. M.,
Maple Reference Manual, 5th ed., WATCOM Pub. Ltd., Waterloo, Ont., 1988.

A.2  MAPLE  SOURCE  CODE 


Listings of the Maple input used to generate expressions for NB and NRMSE for
each of the four estimators considered above are included in this section. The
following file, visibility, contains procedures used in all of the computations.
Maple listings related to each of the estimators D1, D2, C1 and C2 follow.

visibility:

#
# File: visibility
# Date: 18 Jul 88
# Author: J. D. Gorman, ERIM
#
# Intent: Computes an expansion for the fringe visibility measurement V
#         in terms of two new random variables PSI and ETA, and raises it to
#         the Nth power.
#
# Let NS(K) and ND(K) be the number of photons detected at the
# sinisterous and dexterous arms of the amplitude interferometer
# respectively so that:
#
#     E{ ND(K) } = I_0 (1 - Gm(K)) = LambdaD(K)
#     E{ NS(K) } = I_0 (1 + Gm(K)) = LambdaS(K).
#
# Then we define the random variables:
#
#     PSI(K) = {[ND(K)-LambdaD(K)] + [NS(K)-LambdaS(K)]} / sqrt(2 I_0)
#
#     ETA(K) = C*{[ND(K)-LambdaD(K)] - [NS(K)-LambdaS(K)]}
#              / {Gm(K)*sqrt(2 I_0)},
#
# where:
#
#     C = {1 + 2*I_Bckgnd}/I_0.
#
# The result is that:
#
#     [ND(K) - NS(K)]     [1 + ETA(K)]
#     --------------- = ---------------- .
#     [ND(K) + NS(K)]     [1 + PSI(K)]
#
# This ratio is expanded as a series and then the terms of order 0 or
# greater are retained.
#

mean_ratio := proc(K)
    result := expand( (Gm(K)^1) * simplify( expectation( visibility(1,K), K ) ) );
end;

mean_ratio_sq := proc(K)
    result := expand( (Gm(K)^2) * simplify( expectation( visibility(2,K), K ) ) );
end;

mean_ratio_t := proc(K)
    result := expand( (Gm(K)^3) * simplify( expectation( visibility(3,K), K ) ) );
end;

mean_ratio_q := proc(K)
    result := expand( (Gm(K)^4) * simplify( expectation( visibility(4,K), K ) ) );
end;

visibility := proc(N, K)
    local tmp, result;
    tmp := subs( X=psi(K), convert( taylor(1/(1+X), X=0, 10), polynom ) );
    result := (1+eta(K))^N * tmp^N;
end;

ls := proc(K)
    result := (c - Gm(K))/(i0inv);
end;

ld := proc(K)
    result := (c + Gm(K))/(i0inv);
end;


#
# The following procedures are used to calculate the expectation of
# various moments of the fringe visibility.
#
etapsi := proc(m, n, k)
    local i, j, jnki, jnkj, result, rsum, t1, t2, t3;
    option remember;

    if type(m,integer) and m > 0 and type(n,integer) and n > 0 then
        rsum := 0;
        for i from 0 by 1 to m do
            for j from 0 by 1 to n do
                jnki := i;
                jnkj := j;
                t1 := binomial(m,i) * binomial(n,j) * (-1)^i;
                t2 := pcm(ls(k),i+j) * pcm(ld(k),m+n-i-j);
                t3 := ((2.0*Gm(k)/i0inv)^m) * ((2.0*c/i0inv)^n);
                rsum := rsum + ((t1*t2)/t3);
            od;
        od;
    elif type(m,integer) and m > 0 and type(n,integer) and n = 0 then
        rsum := 0;
        for i from 0 by 1 to m do
            jnki := i;
            t1 := binomial(m,i) * (-1)^i;
            t2 := pcm(ls(k),i) * pcm(ld(k),m-i);
            t3 := (2.0*Gm(k)/i0inv)^m;
            rsum := rsum + ((t1*t2)/t3);
        od;
    elif type(m,integer) and m = 0 and type(n,integer) and n > 0 then
        rsum := 0;
        for j from 0 by 1 to n do
            jnkj := j;
            t1 := binomial(n,j);
            t2 := pcm(ls(k),j) * pcm(ld(k),n-j);
            t3 := ((2.0*c/i0inv)^n);
            rsum := rsum + ((t1*t2)/t3);
        od;
    fi;
    result := rsum;
end;
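
The double sum in `etapsi` is a binomial expansion of a joint moment of two independent, centered Poisson counts. The sketch below illustrates that expansion in Python under stated assumptions: a and b stand for the centered counts ND - LambdaD and NS - LambdaS, the normalization by `t3` is omitted, and only central moments up to fourth order are tabulated. The function names are hypothetical and not part of the original listing.

```python
from math import comb


def poisson_central_moment(lam: float, order: int) -> float:
    """Central moment of a Poisson variable with mean lam (orders 0..4 only)."""
    table = {0: 1.0, 1: 0.0, 2: lam, 3: lam, 4: lam + 3.0 * lam ** 2}
    return table[order]


def joint_moment(m: int, n: int, lam_d: float, lam_s: float) -> float:
    """E[(a - b)^m (a + b)^n] for independent centered Poisson counts a, b.

    a has mean parameter lam_d and b has mean parameter lam_s; m + n <= 4.
    """
    total = 0.0
    for i in range(m + 1):
        for j in range(n + 1):
            # (a - b)^m contributes (-1)^i; b carries power i + j in total
            weight = comb(m, i) * comb(n, j) * (-1) ** i
            total += (weight
                      * poisson_central_moment(lam_s, i + j)
                      * poisson_central_moment(lam_d, m + n - i - j))
    return total


# E[(a - b)^2] is the sum of the two variances.
print(joint_moment(2, 0, 3.0, 5.0))
# E[(a - b)(a + b)] = E[a^2] - E[b^2], the difference of the variances.
print(joint_moment(1, 1, 3.0, 5.0))
```

The two printed cases can be checked by hand, which makes them a convenient sanity test before trusting higher-order terms.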


subetapsi := proc(X,K)
    local i, result, t1, t2;
    option remember;
    t1 := X;
    for i from 12 by -1 to 1 do
        for j from 12 by -1 to 1 do
            if i > 5 or j > 5 then
                t2 := subs( eta(K)^i * psi(K)^j = 0, t1 );
            else
                t2 := subs( eta(K)^i * psi(K)^j = etapsi(i,j,K), t1 )
            fi;
            t1 := t2;
        od
    od
end;

subpsi := proc(X,K)
    local i, result, t1, t2;
    option remember;
    t1 := X;
    for i from 12 by -1 to 1 do
        if i > 5 then
            t2 := subs( psi(K)^i = 0, t1 )
        else
            t2 := subs( psi(K)^i = etapsi(0,i,K), t1 );
        fi;
        t1 := t2;
    od
end;

subeta := proc(X,K)
    local i, result, t1, t2;
    option remember;
    t1 := X;
    for i from 12 by -1 to 1 do
        if i > 5 then
            t2 := subs( eta(K)^i = 0, t1 )
        else
            t2 := subs( eta(K)^i = etapsi(i,0,K), t1 )
        fi;
        t1 := t2;
    od
end;

expectation := proc(X,K)
    local result, t1, t2;
    option remember;
    t1 := expand(X);
    t2 := subetapsi(t1,K);
    t1 := subeta(t2,K);
    t2 := subpsi(t1,K);
    result := t2;
end;


#
# This procedure calculates the Nth central moment of a Poisson
# random variable having parameter X.
#
pcm := proc(X,N)
    local result, Y, tmp;
    if type(N,integer) then
        if   N = 0 then result := 1
        elif N = 1 then result := 0
        elif N = 2 then result := X
        elif N = 3 then result := X
        elif N = 4 then result := X + 3*X^2
        elif N = 5 then result := X + 10*X^2
        elif N = 6 then result := X + 25*X^2 + 15*X^3
        elif N = 7 then result := X + 56*X^2 + 105*X^3
        elif N = 8 then result := X + 119*X^2 + 490*X^3 + 105*X^4
        else
            tmp := Y*((N-1)*pcm(Y,(N-2)) + diff( pcm(Y,(N-1)), Y ));
            result := subs(Y=X, tmp);
        fi;
    fi;
    result
end;
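
The general branch of `pcm` appears to rely on the standard recurrence for Poisson central moments, mu_N = X [ (N-1) mu_(N-2) + d mu_(N-1)/dX ], with mu_0 = 1 and mu_1 = 0. The sketch below applies that recurrence to polynomial coefficient lists so the low-order entries can be regenerated independently. The function name and the list representation are assumptions made for illustration; they are not part of the original Maple file.

```python
def poisson_central_moments(max_order: int) -> list[list[int]]:
    """Central moments of a Poisson variable as polynomials in its mean X.

    moments[n][k] is the coefficient of X**k in the n-th central moment.
    """
    moments = [[1], [0]]  # mu_0 = 1, mu_1 = 0
    for n in range(2, max_order + 1):
        prev, prev2 = moments[n - 1], moments[n - 2]
        # derivative of mu_{n-1} with respect to X
        deriv = [k * prev[k] for k in range(1, len(prev))]
        size = max(len(deriv), len(prev2))
        inner = [0] * size
        for k, coeff in enumerate(deriv):
            inner[k] += coeff
        for k, coeff in enumerate(prev2):
            inner[k] += (n - 1) * coeff
        moments.append([0] + inner)  # multiplying by X shifts every power up
    return moments[: max_order + 1]


for order, coeffs in enumerate(poisson_central_moments(8)):
    print(order, coeffs)
```

Comparing the printed coefficients with a hard-coded table is a cheap way to catch transcription errors in the higher-order entries.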

Estimator D1:

read( visibility );

#
# File: Ncurrie.mse
# Date: 19 Oct 88
# Author: J. D. Gorman
#
mean_ratio_sq_K := mean_ratio_sq(K):

#
# Calculate BIAS of Discrete-Phase Normalized Estimator
#
sum_mean_ratio_sq := proc()
    local tmp1, tmp2, result;
    tmp1 := expand( mean_ratio_sq_K );
    tmp2 := subs( Gm(K)^2 = GM^2*(nframes/2), tmp1 );
    result := tmp2;
end;

bias := simplify( ((2/nframes)*sum_mean_ratio_sq()) - GM^2 );

#
# Calculate Terms in MSE
#
sq_sum_mean_ratio_sq := proc()
    local tmp1, result;
    tmp1 := sum_mean_ratio_sq();
    result := expand( tmp1^2 );
end;

sum_sq_mean_ratio_sq := proc()
    local tmp1, tmp2, tmp3, tmp4, result;
    tmp1 := mean_ratio_sq_K;
    tmp2 := expand( tmp1^2 );
    tmp3 := subs( Gm(K)^4 = GM^4*(qsum*nframes), tmp2 );
    tmp4 := subs( Gm(K)^2 = GM^2*(nframes/2), tmp3 );
    result := tmp4;
end;

sum_mean_ratio_q := proc()
    local tmp1, tmp2, tmp3, result;
    tmp1 := expand( mean_ratio_q(K) );
    tmp2 := subs( Gm(K)^4 = GM^4*(qsum*nframes), tmp1 );
    tmp3 := subs( Gm(K)^2 = GM^2*(nframes/2), tmp2 );
    result := tmp3;
end;

expected_quad_term := (4/nframes^2) * ( sum_mean_ratio_q()
    - sum_sq_mean_ratio_sq() + sq_sum_mean_ratio_sq() );

mse := simplify( expected_quad_term - GM^4 - (2*bias*GM^2) );

#
# Simplify BIAS and MSE
#
bias := simplify( expand( bias ) );
mse := simplify( expand( mse ) );

bias_c0 := simplify( coeff( expand( bias ), i0inv, 0 ) );
bias_c1 := simplify( coeff( expand( bias ), i0inv, 1 ) );
bias_c2 := simplify( coeff( expand( bias ), i0inv, 2 ) );
bias_c3 := simplify( coeff( expand( bias ), i0inv, 3 ) );

mse_c0 := simplify( coeff( mse, i0inv, 0 ) );
mse_c1 := simplify( coeff( mse, i0inv, 1 ) );
mse_c2 := simplify( coeff( mse, i0inv, 2 ) );
mse_c3 := simplify( coeff( mse, i0inv, 3 ) );

bias_c0 := expand( bias_c0 );
bias_c1 := expand( bias_c1 );
bias_c2 := expand( bias_c2 );
bias_c3 := expand( bias_c3 );

mse_c0 := expand( mse_c0 );
mse_c1 := expand( mse_c1 );
mse_c2 := expand( mse_c2 );
mse_c3 := expand( mse_c3 );

bias_expr := bias_c0 + bias_c1*i0inv + bias_c2*(i0inv^2) + bias_c3*(i0inv^3);
mse_expr := mse_c0 + mse_c1*i0inv + mse_c2*(i0inv^2) + mse_c3*(i0inv^3);
snr_expr := mse_expr - bias_expr^2;

bias_expr := expand( bias_expr );
mse_expr := expand( mse_expr );
snr_expr := expand( snr_expr );

latex( bias_expr );
latex( mse_expr );
latex( snr_expr );
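
The `mse` assignment in the listing above subtracts GM^4 and 2*bias*GM^2 from the expected squared estimate. A short derivation shows why that combination is the mean-squared error. The symbol \hat{G} for the estimator of G_M^2 is notation assumed here for illustration; it does not appear in the listing.

```latex
\begin{aligned}
\mathrm{MSE}
  &= E\!\left[\bigl(\hat{G} - G_M^2\bigr)^2\right]
   = E\!\left[\hat{G}^2\right] - 2\,G_M^2\,E\!\left[\hat{G}\right] + G_M^4 \\
  &= E\!\left[\hat{G}^2\right] - 2\,G_M^2\bigl(G_M^2 + \mathrm{Bias}\bigr) + G_M^4
   \qquad\text{since } E[\hat{G}] = G_M^2 + \mathrm{Bias} \\
  &= E\!\left[\hat{G}^2\right] - G_M^4 - 2\,\mathrm{Bias}\,G_M^2 .
\end{aligned}
```

By the same algebra, the quantity the listing names `snr_expr`, equal to `mse_expr - bias_expr^2`, is the variance of the estimator, because MSE = variance + Bias^2.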


Estimator D2:

read( visibility );

#
# File: ac_mse
# Date: 19 Oct 88
# Author: John D. Gorman
#
ND := proc(k)
    result := ld(k) + dd(k);
end;

NS := proc(k)
    result := ls(k) + ds(k);
end;

mean_diff := proc()
    result := expectddds( expand( (0.5*i0inv)^1 * (ND(K) - NS(K))^1 ), K );
end;

mean_diff_sq := proc()
    result := expectddds( expand( (0.5*i0inv)^2 * (ND(K) - NS(K))^2 ), K );
end;

mean_diff_t := proc()
    result := expectddds( expand( (0.5*i0inv)^3 * (ND(K) - NS(K))^3 ), K );
end;

mean_diff_q := proc()
    result := expectddds( expand( (0.5*i0inv)^4 * (ND(K) - NS(K))^4 ), K );
end;

expectddds := proc(X,K)
    local i, result, t1, t2, t3;
    option remember;
    t1 := X;
    for i from 4 by -1 to 1 do
        t2 := subs( ( dd(K) )^i = pcm( ld(K), i ), t1 );
        t3 := subs( ( ds(K) )^i = pcm( ls(K), i ), t2 );
        t1 := t3;
    od
end;

mean_diff_sq_K := expand( mean_diff_sq(K) );

#
# Calculate BIAS of Discrete-Phase Normalized Estimator
#
sum_mean_diff_sq := proc()
    local tmp1, tmp2, tmp3, result;
    tmp1 := mean_diff_sq_K;
    tmp2 := expand( tmp1 );
    tmp3 := subs( Gm(K)^2 = GM^2*(nframes/2), tmp2 );
    result := tmp3;
end;

bias := simplify( ((2/nframes)*sum_mean_diff_sq()) - GM^2 );

#
# Calculate Terms in MSE
#
sq_sum_mean_diff_sq := proc()
    local tmp1, result;
    tmp1 := sum_mean_diff_sq();
    result := expand( tmp1^2 );
end;

sum_sq_mean_diff_sq := proc()
    local tmp1, tmp2, tmp3, tmp4, tmp5, result;
    tmp1 := mean_diff_sq_K;
    tmp2 := expand( tmp1^2 );
    tmp3 := subs( Gm(K)^4 = GM^4*(qsum*nframes), tmp2 );
    tmp4 := subs( Gm(K)^2 = GM^2*(nframes/2), tmp3 );
    result := tmp4;
end;

sum_mean_diff_q := proc()
    local tmp1, tmp2, tmp3, tmp4, result;
    tmp1 := mean_diff_q(K);
    tmp2 := expand( tmp1 );
    tmp3 := subs( Gm(K)^4 = GM^4*(qsum*nframes), tmp2 );
    tmp4 := subs( Gm(K)^2 = GM^2*(nframes/2), tmp3 );
    result := tmp4;
end;

expected_quad_term := (4/nframes^2) * ( sq_sum_mean_diff_sq() - sum_sq_mean_diff_...
mse := simplify( expected_quad_term - GM^4 - (2*bias*GM^2) );

#
# Simplify BIAS and MSE
#
bias := simplify( expand( bias ) );
mse := simplify( expand( mse ) );

bias_c0 := simplify( coeff( expand( bias ), i0inv, 0 ) );
bias_c1 := simplify( coeff( expand( bias ), i0inv, 1 ) );
bias_c2 := simplify( coeff( expand( bias ), i0inv, 2 ) );
bias_c3 := simplify( coeff( expand( bias ), i0inv, 3 ) );

mse_c0 := simplify( coeff( mse, i0inv, 0 ) );
mse_c1 := simplify( coeff( mse, i0inv, 1 ) );
mse_c2 := simplify( coeff( mse, i0inv, 2 ) );
mse_c3 := simplify( coeff( mse, i0inv, 3 ) );

bias_c0 := expand( bias_c0 );
bias_c1 := expand( bias_c1 );
bias_c2 := expand( bias_c2 );
bias_c3 := expand( bias_c3 );

mse_c0 := expand( mse_c0 );
mse_c1 := expand( mse_c1 );
mse_c2 := expand( mse_c2 );
mse_c3 := expand( mse_c3 );

bias_expr := bias_c0 + bias_c1*i0inv + bias_c2*(i0inv^2) + bias_c3*(i0inv^3);
mse_expr := mse_c0 + mse_c1*i0inv + mse_c2*(i0inv^2) + mse_c3*(i0inv^3);
snr_expr := mse_expr - bias_expr^2;

bias_expr := expand( bias_expr );
mse_expr := expand( mse_expr );
snr_expr := expand( snr_expr );

latex( bias_expr );
latex( mse_expr );
latex( snr_expr );

quit;

Estimator C1:

#
# File: Nac.mse
# Date: 22 Aug 88
# Author: John D. Gorman
#
read( visibility );

G := expand( (A - C)^2 + (B - D)^2 );
Gsq := expand( G^2 );

tildeGA := expand( (tildeA - tildeC)^2 + (tildeB - tildeD)^2 );
tildeGAsq := expand( tildeGA^2 );

tildeGB := subs( tildeA^2 = tildeAsq, tildeGA );
tildeGBq := subs( tildeA^4 = tildeAq, tildeGAsq );
tildeGBt := subs( tildeA^3 = tildeAt, tildeGBq );
tildeGBsq := subs( tildeA^2 = tildeAsq, tildeGBt );

tildeGC := subs( tildeB^2 = tildeBsq, tildeGB );
tildeGCq := subs( tildeB^4 = tildeBq, tildeGBsq );
tildeGCt := subs( tildeB^3 = tildeBt, tildeGCq );
tildeGCsq := subs( tildeB^2 = tildeBsq, tildeGCt );

tildeGD := subs( tildeC^2 = tildeCsq, tildeGC );
tildeGDq := subs( tildeC^4 = tildeCq, tildeGCsq );
tildeGDt := subs( tildeC^3 = tildeCt, tildeGDq );
tildeGDsq := subs( tildeC^2 = tildeCsq, tildeGDt );

tildeG := expand( subs( tildeD^2 = tildeDsq, tildeGD ) );
tildeGq := subs( tildeD^4 = tildeDq, tildeGDsq );
tildeGt := subs( tildeD^3 = tildeDt, tildeGq );
tildeGsq := expand( subs( tildeD^2 = tildeDsq, tildeGt ) );

A := Gm(1);
B := Gm(2);
C := Gm(3);
D := Gm(4);

tildeA := mean_ratio(1):
tildeAsq := mean_ratio_sq(1):
tildeAt := mean_ratio_t(1):
tildeAq := mean_ratio_q(1):

tildeB := mean_ratio(2):
tildeBsq := mean_ratio_sq(2):
tildeBt := mean_ratio_t(2):
tildeBq := mean_ratio_q(2):

tildeC := mean_ratio(3):
tildeCsq := mean_ratio_sq(3):
tildeCt := mean_ratio_t(3):
tildeCq := mean_ratio_q(3):

tildeD := mean_ratio(4):
tildeDsq := mean_ratio_sq(4):
tildeDt := mean_ratio_t(4):
tildeDq := mean_ratio_q(4):

tildeGexp := expand( tildeG );
tildeGsqexp := expand( tildeGsq );

bias := tildeGexp - G;

sqterm := tildeGsqexp - Gsq;
oterm := expand( 2 * G * bias );
mse := sqterm - oterm;

#
# Simplify BIAS and MSE
#
bias := simplify( expand( bias ) );
mse := simplify( expand( mse ) );

bias_c0 := simplify( coeff( expand( bias ), i0inv, 0 ) );
bias_c1 := simplify( coeff( expand( bias ), i0inv, 1 ) );
bias_c2 := simplify( coeff( expand( bias ), i0inv, 2 ) );
bias_c3 := simplify( coeff( expand( bias ), i0inv, 3 ) );

mse_c0 := simplify( coeff( mse, i0inv, 0 ) );
mse_c1 := simplify( coeff( mse, i0inv, 1 ) );
mse_c2 := simplify( coeff( mse, i0inv, 2 ) );
mse_c3 := simplify( coeff( mse, i0inv, 3 ) );

bias_c0 := expand( bias_c0 );
bias_c1 := expand( bias_c1 );
bias_c2 := expand( bias_c2 );
bias_c3 := expand( bias_c3 );

mse_c0 := expand( mse_c0 );
mse_c1 := expand( mse_c1 );
mse_c2 := expand( mse_c2 );
mse_c3 := expand( mse_c3 );

bias_expr := bias_c0 + bias_c1*i0inv + bias_c2*(i0inv^2) + bias_c3*(i0inv^3);
mse_expr := mse_c0 + mse_c1*i0inv + mse_c2*(i0inv^2) + mse_c3*(i0inv^3);
snr_expr := mse_expr - bias_expr^2;

bias_expr := expand( bias_expr );
mse_expr := expand( mse_expr );
snr_expr := expand( snr_expr );

latex( bias_expr );
latex( mse_expr );
latex( snr_expr );

quit;

Estimator C2:

#
# File: ac.mse
# Date: 22 Aug 88
# Author: John D. Gorman
#
read( visibility ):

ND := proc(k)
    result := ld(k) + dd(k);
end;

NS := proc(k)
    result := ls(k) + ds(k);
end;

G := expand( (A - C)^2 + (B - D)^2 );
Gsq := expand( G^2 );

tildeG := expand( (tildeA - tildeC)^2 + (tildeB - tildeD)^2 );
tildeGsq := expand( tildeG^2 );

A := Gm(1);
B := Gm(2);
C := Gm(3);
D := Gm(4);

tildeA := 0.5 * i0inv * (ND(1) - NS(1));
tildeB := 0.5 * i0inv * (ND(2) - NS(2));
tildeC := 0.5 * i0inv * (ND(3) - NS(3));
tildeD := 0.5 * i0inv * (ND(4) - NS(4));

expectddds := proc(X)
    local i, j, result, t1, t2, t3;
    option remember;
    t1 := X;
    for i from 4 by -1 to 1 do
        for j from 1 by 1 to 4 do
            t2 := subs( ( dd(j) )^i = pcm( ld(j), i ), t1 );
            t3 := subs( ( ds(j) )^i = pcm( ls(j), i ), t2 );
            t1 := t3;
        od
    od
end;

tildeGexp := expectddds( expand( tildeG ) );
tildeGsqexp := expectddds( expand( tildeGsq ) );

bias := tildeGexp - G;

sqterm := tildeGsqexp - Gsq;
oterm := expand( 2 * G * bias );
mse := sqterm - oterm;

#
# Simplify BIAS and MSE
#
bias := simplify( expand( bias ) );
mse := simplify( expand( mse ) );

bias_c0 := simplify( coeff( expand( bias ), i0inv, 0 ) );
bias_c1 := simplify( coeff( expand( bias ), i0inv, 1 ) );
bias_c2 := simplify( coeff( expand( bias ), i0inv, 2 ) );
bias_c3 := simplify( coeff( expand( bias ), i0inv, 3 ) );

mse_c0 := simplify( coeff( mse, i0inv, 0 ) );
mse_c1 := simplify( coeff( mse, i0inv, 1 ) );
mse_c2 := simplify( coeff( mse, i0inv, 2 ) );
mse_c3 := simplify( coeff( mse, i0inv, 3 ) );

bias_c0 := expand( bias_c0 );
bias_c1 := expand( bias_c1 );
bias_c2 := expand( bias_c2 );
bias_c3 := expand( bias_c3 );

mse_c0 := expand( mse_c0 );
mse_c1 := expand( mse_c1 );
mse_c2 := expand( mse_c2 );
mse_c3 := expand( mse_c3 );

bias_expr := bias_c0 + bias_c1*i0inv + bias_c2*(i0inv^2) + bias_c3*(i0inv^3);
mse_expr := mse_c0 + mse_c1*i0inv + mse_c2*(i0inv^2) + mse_c3*(i0inv^3);
snr_expr := mse_expr - bias_expr^2;

bias_expr := expand( bias_expr );
mse_expr := expand( mse_expr );
snr_expr := expand( snr_expr );

latex( bias_expr );
latex( mse_expr );
latex( snr_expr );

quit;

Abstract—A Chapman-Robbins form of the Barankin bound is used
to derive a multiparameter Cramer-Rao (CR) type lower bound on
estimator error covariance when the parameter θ ∈ R^n is constrained to
lie in a subset of the parameter space. A simple form for the constrained
CR bound is obtained when the constraint set Θ_C can be expressed as a
smooth functional inequality constraint, Θ_C = {θ: G_θ ≤ 0}. We show that
the constrained CR bound is identical to the unconstrained CR bound at
the regular points of Θ_C, i.e., where no equality constraints are active.
On the other hand, at those points θ ∈ Θ_C where pure equality
constraints are active the full-rank Fisher information matrix in the
unconstrained CR bound must be replaced by a rank-reduced Fisher
information matrix obtained as a projection of the full-rank Fisher matrix onto
the tangent hyperplane of the constraint set at θ. A necessary and
sufficient condition involving the forms of the constraint and the
likelihood function is given for the bound to be achievable, and examples for
which the bound is achieved are presented. In addition to providing a
useful generalization of the CR bound, our results permit analysis of the
gain in achievable MSE performance due to the imposition of particular
constraints on the parameter space without the need for a global
reparameterization. For the purpose of illustration, we apply the
constrained bound to problems involving linear constraints and quadratic
constraints. Specific examples considered include: linear constraints for
Gaussian linear models, object support constraints in image
reconstruction, signal subspace constraints in sensor array processing, and
average power constraints in spectral estimation and signal extraction.

Index Terms—Constrained estimation, Cramer-Rao bounds, multiple
parameter estimation, spectrum estimation.


I. Introduction

THE MULTIPLE PARAMETER Cramer-Rao (CR)
lower bound is widely used to investigate the fundamental
limits on estimator performance in multidimensional
parameter estimation problems, and in single parameter
estimation problems involving unknown nuisance
parameters. The CR bound on estimator error covariance
is computed as the inverse of the Fisher information
matrix premultiplied and postmultiplied by the gradient
of the mean vector of the estimator. Although elementary
derivations, for instance [27, Section 2.4], may not
explicitly make the assumption, the CR bound is typically
derived under the assumption that the parameter space is
an open subset of R^n [13, Section 1.7]. Frequently,
however, the parameter is constrained to lie in a proper
non-open subset of the original parameter space. Some
examples are: bandwidth, support, and positivity
constraints in phase retrieval [5], |d] and tomographic
reconstruction [24], [29]; kernel-sieve constraints in
probability-density estimation [25]; array geometry constraints in
estimation of coupled times-of-arrival across
multiple-sensor arrays [28]; and auto-correlation lag constraints in
maximum-entropy spectral analysis and image
reconstruction [23]. Constraints restrict the allowable parameter
variations and hence the local structure of the
log-likelihood function over the constrained parameter space may
be changed. Specifically, the average curvature of the
log-likelihood function, and in particular the Fisher
information matrix, may be affected, thereby invalidating the
unconstrained CR bound.

We present a multiparameter CR type bound for
parametric estimators when the vector parameter θ is
constrained to lie in a subset Θ_C of R^n. We refer to this
bound as a constrained CR bound. The constrained CR
bound is derived directly from a version of the Barankin
bound: the multiple parameter Chapman-Robbins bound.
The tightest such Barankin bound is nonincreasing as Θ_C
decreases. Thus, in general, a bound reduction occurs as a
result of incorporating constraints. When θ is a
nonisolated point in a locally convex region of Θ_C, and the
log-likelihood function is smooth, the constrained CR
bound depends on Θ_C only through the linear span of a
set of basis vectors for the region. When the constraints
on the parameter take the form of smooth functional
inequality constraints G_θ ≤ 0, more explicit results are
obtained. Specifically, let the inequality constraint be
decomposed into a finite vector of equality constraints
G_θ = 0 and a finite vector of pure inequality constraints
W_θ < 0 (defined in Section II-C). Then the constrained
CR bound is obtained by implementing the classical
unconstrained CR bound with a different "constrained"
Fisher matrix. The structure of the constrained Fisher
matrix depends on whether or not θ is a regular point of
Θ_C, where a regular point is a point where no equality
constraints are active. As examples, points on the interior
and points on the boundary of open regions in Θ_C are
regular points. It is shown that if θ is a regular point then
the constrained Fisher matrix is identical to the
unconstrained Fisher matrix for that point. Conversely, if θ is
not a regular point, the constrained Fisher matrix is the
product of the unconstrained Fisher matrix and a
θ-dependent, rank-deficient, idempotent matrix whose
columns span a hyperplane that is tangent to the
constraint set at θ.

The  constrained  CR  bound  presented  here  has  the 
following  attributes. 

•  For  range  constraints,  orthani  constraints,  positivitx 
constraints,  and  anx  other  constraint  sets  (->,  with  no 
isolated  boundaries,  the  constrained  CR  bound  is 
identical  to  the  unconstrained  CR  bound  restricted  to 
(-)(  .  Hence  the  incorporation  of  these  types  of  con¬ 
straints  provides  no  CR  bound  reduction. 

•  For  constraints  which  restrict  6  to  a  lower-dimen¬ 
sional  manifold  of  parameter  space,  eg.,  through 
active  equality  constraints  of  the  form  G,  =  0.  the 
unconstrained  CR  bound  is  invalid  and  a  reduced- 
rank  Fisher  matrix  must  be  used. 

•  While an equivalent lower-dimensional unconstrained
parameter estimation problem can sometimes be specified
via a reparameterization of parameter space, such a
global reparameterization is not necessary for the
computation of the constrained CR bound. Rather, the
constrained CR bound only depends on the local
properties of the constraint set through its tangent
hyperplanes. Since the tangent hyperplanes can typically
be computed much more easily than can a global
reparameterization of parameter space, the amount of
bound reduction due to particular constraints is more
easily analyzed.

•  Conditions  under  which  the  constrained  CR  bound  is 
achieved  are  similar  to  those  required  for  achieve¬ 
ment  of  the  unconstrained  CR  bound.  Examples  are 
provided  for  which  the  constrained  CR  bound  is 
achievable. 
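
The role of the tangent hyperplane can be sketched numerically. The
example below is only an illustration under stated assumptions: the
estimator is taken to be unbiased, the bound is taken in the
reduced-rank form $U (U^T J U)^{-1} U^T$ with $U$ a basis of the tangent
space, and the Fisher matrix `J`, the unit-sphere constraint, and the
helper names `tangent_basis` and `constrained_crb` are all hypothetical
choices made for this sketch.

```python
import numpy as np

def tangent_basis(grad_G, tol=1e-10):
    """Orthonormal basis U (n x (n - r)) of the null space of the r x n
    constraint gradient, i.e. of the tangent space of {G = 0}."""
    _, s, vt = np.linalg.svd(np.atleast_2d(grad_G))
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def constrained_crb(J, U):
    """Reduced-rank bound U (U^T J U)^{-1} U^T (unbiased estimator assumed)."""
    return U @ np.linalg.inv(U.T @ J @ U) @ U.T

# Hypothetical 3-parameter problem with one active equality constraint
# G(theta) = ||theta||^2 - 1 = 0 (unit sphere), evaluated at theta.
theta = np.array([0.0, 0.6, 0.8])
J = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 2.0]])        # made-up positive definite Fisher matrix
grad_G = 2.0 * theta[None, :]          # 1 x 3 gradient of the constraint

U = tangent_basis(grad_G)
B_con = constrained_crb(J, U)          # rank n - 1 = 2
B_unc = np.linalg.inv(J)               # unconstrained CR bound
gap_eigs = np.linalg.eigvalsh(B_unc - B_con)   # nonnegative up to rounding
```

Only the gradient of the constraint at $\theta$ is needed; no global
reparameterization of the sphere is computed. The eigenvalues in
`gap_eigs` being nonnegative reflects that the constrained bound never
exceeds the unconstrained one in this setting.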

The following geometrical interpretation helps in understanding
the effect of constraints on the CR bound.
The Fisher information matrix $J_\theta$, being the negative of the
expected value of the Hessian matrix of the ($n$-dimensional)
log-likelihood surface at $\theta$, can be related to the average
curvature of the log-likelihood surface at $\theta$ along $n$ different
directions in $\mathbb{R}^n$. Thus the unconstrained CR bound is
a function of the variation of the likelihood surface over
an $n$-dimensional neighborhood of $\theta$. When the parameter
constraint $G_u = 0$, $u \in \mathbb{R}^n$, is introduced, local parameter
variations will generally be restricted to lie in a lower-dimensional
neighborhood. This neighborhood is contained in the linear vector
space which is tangent to the constraint set $\{u : G_u = 0\}$ at the
point $u = \theta$. As the parameter varies over the lower-dimensional
neighborhood, only certain "constrained" trajectories are traversed
on the likelihood surface. Thus the average curvature of the surface
appears different for the constrained parameter, as compared to the
unconstrained parameter for which all local trajectories are allowed.
This results in a change in the associated Fisher information matrix
and a different CR bound. This constrained CR bound depends on the
constraint set only through its tangent space at the point $\theta$.

It is interesting to note that tangent space approximations
to subsets of parameter space arise in general
asymptotic statistical theory [13], [14] and specific applications
have appeared in the statistical literature. For example,
tangent spaces arise in: the study C’] of the asymptotic
distribution of the likelihood ratio for testing composite
hypotheses involving smooth boundaries; the study [18] of
the asymptotic distribution of a specific estimator arising
in a composite detection problem with inequality constraints
on the unknown parameter; the study l-lj of
asymptotic efficiency of estimators in partially parametric
models; the study [1] of the asymptotic distribution of
maximum likelihood estimators subject to equality constraints.
While the study of finite sample CR bounds and
the study of asymptotic properties of estimators have
points in common, it is important to distinguish between
the results of this paper and the aforementioned references.
First, our result is a general finite sample CR lower
bound on estimator covariance for fully parametric models.
Second, the bound is of a simple and explicit form
which is useful for studying the impact of particular
parameter constraints on estimation error covariance.
Third, while the CR bound holds for any estimator whose
mean is smooth, the CR bound is not applicable to cases
where the estimator has a nondifferentiable mean, such
as the estimator considered in [18]. Furthermore, since
the bound is a finite sample bound on covariance, methods
of large sample theory are not needed for our derivation,
permitting a more elementary, and therefore more
accessible, presentation.

To illustrate the utility of the constrained CR bound,
we investigate the effect of constraints on the achievable
estimator error for several representative problems in
signal processing. First we consider the problem of estimation
of parameters subject to linear constraints in the
general linear Gaussian model. For this problem the
tangent hyperplanes of the constraint set are functionally
independent of the parameter $\theta$, and hence the constrained
CR lower bound can be achieved by projecting
the unconstrained minimum variance unbiased (MVU)
estimator onto the tangent hyperplane. The amount of
bound reduction depends on the rank of the projection of
the covariance matrix of the unconstrained MVU onto
the linear constraint subspace.

Second, we consider the problem of image reconstruction
subject to support constraints on the image. The
constrained CR bound is equal to the pseudo-inverse of a
constrained Fisher matrix, obtained by zeroing out the
rows and columns of the unconstrained Fisher information
matrix which are associated with estimator errors
outside of the region of support. It is significant that this
is not generally the same as zeroing out rows and columns
of the unconstrained CR bound, unless the image pixels
are statistically independent. This establishes that, if an
efficient estimator of the unconstrained image exists, zeroing
the unconstrained efficient estimator outside of the
support region does not, in general, provide an efficient
constrained estimator.
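
The distinction between zeroing the Fisher matrix and zeroing the CR
bound can be checked on a small example. The sketch below is
illustrative only: the 3-pixel Fisher matrices and the support mask are
made up, and the function names are hypothetical. It compares the
pseudo-inverse of the zeroed Fisher matrix with the zeroed inverse of
the full Fisher matrix.

```python
import numpy as np

def support_constrained_bound(J, support):
    """Zero the rows/columns of J outside the support, then pseudo-invert."""
    mask = np.asarray(support, dtype=float)
    J_c = J * np.outer(mask, mask)
    return np.linalg.pinv(J_c, rcond=1e-10)

def zeroed_unconstrained_bound(J, support):
    """Invert the full J first, then zero rows/columns outside the support."""
    mask = np.asarray(support, dtype=float)
    return np.linalg.inv(J) * np.outer(mask, mask)

# Made-up Fisher matrices for a 3-pixel "image".
J_corr = np.array([[2.0, 0.8, 0.3],
                   [0.8, 2.0, 0.8],
                   [0.3, 0.8, 2.0]])   # coupled pixels
J_diag = np.diag([2.0, 3.0, 4.0])      # decoupled pixels
support = [1, 1, 0]                    # third pixel lies outside the support

B_con_corr = support_constrained_bound(J_corr, support)
B_zero_corr = zeroed_unconstrained_bound(J_corr, support)
B_con_diag = support_constrained_bound(J_diag, support)
B_zero_diag = zeroed_unconstrained_bound(J_diag, support)
```

For the coupled matrix the two results differ, and the constrained bound
is the smaller of the two; for the diagonal matrix they coincide.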

Third, power spectral density (PSD) estimation subject
to average power constraints over disjoint frequency intervals,
called frequency bands, is considered. For the
case where the unconstrained Fisher information matrix
is diagonal, corresponding to large observation time, it is
shown that the constrained Fisher matrix is block diagonal.
This means that average power constraints effectively
couple the PSD estimation errors over a particular frequency
band, but do not couple errors across different
frequency bands. Within a particular frequency band
where average power constraints are active, our results
indicate that bound reduction is greatest over frequency
bands where there are highly resolved spectral peaks,
while there is virtually no reduction over bands where the
true spectrum is smooth. This suggests that average power
constraints make peaks easier to estimate but have little
impact on the estimation of the rest of the spectrum.
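
The block structure induced by band-wise average power constraints can
be sketched numerically. This is an illustration under assumptions, not
the paper's derivation: the PSD is represented by six samples in two
bands, the Fisher matrix is a made-up diagonal matrix, the estimator is
taken to be unbiased, and the code displays the constrained bound
$U (U^T J U)^{-1} U^T$ (with $U$ spanning the null space of the
constraint gradient) rather than the constrained Fisher matrix itself.

```python
import numpy as np

def constrained_bound(J, G):
    """U (U^T J U)^{-1} U^T with U spanning the null space of G (r x n)."""
    _, s, vt = np.linalg.svd(G)
    rank = int(np.sum(s > 1e-10))
    U = vt[rank:].T
    return U @ np.linalg.inv(U.T @ J @ U) @ U.T

# Hypothetical setup: 6 PSD samples, two bands of 3 samples each.
J = np.diag([1.0, 4.0, 2.0, 3.0, 0.5, 5.0])        # diagonal Fisher matrix
G = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float) / 3.0   # average power per band

B = constrained_bound(J, G)
cross_band = B[:3, 3:]     # error coupling between the two bands
within_band = B[:3, :3]    # error coupling inside the first band
```

In this sketch `cross_band` vanishes while `within_band` has nonzero
off-diagonal entries, which matches the statement that the constraints
couple errors within a band but not across bands.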

Fourth, the estimation of the eigenvalues of a structured
covariance matrix subject to signal subspace constraints
is considered. We put this problem in the context
of estimating the eigenvalues and eigenvectors of the
array covariance matrix when it is known a priori that $p$
of the eigenvalues, the "signal dependent eigenvalues,"
are larger than the remaining eigenvalues, the "noise
eigenvalues," and that these latter eigenvalues are identical.
When the unconstrained Fisher matrix is block diagonal,
the constrained CR bound can be achieved by averaging
the noise eigenvalues of an efficient unconstrained
estimator, if one exists.

Finally, we consider the problem of estimation of a
deterministic time varying signal, and its Fourier transform,
subject to average power constraints applied to its
spectrum (squared Fourier magnitudes). Unlike the PSD
estimation  problem  previously  mentioned,  here  the  con¬ 
straints  on  the  parameters  (the  signal)  are  nonlinear. 
Nonetheless,  it  is  shown  that  if  the  unconstrained  Fisher 
information  is  an  identity  matrix,  e.g.,  corresponding  to 
observation  of  the  signal  in  additive-white-Gaussian  noise, 
the  structure  of  the  constrained  Fisher  matrix  is  identical 
to  the  structure  found  in  the  PSD  estimation  problem, 
with  the  signal  spectrum  taking  the  place  of  the  PSD. 

An outline of the paper is as follows. Section II is
divided into several subsections. In Section II-A a
Barankin lower bound on the estimator covariance is
given for general constrained parameters. In Section II-B
the constrained CR bound is derived from this Barankin
bound for locally convex regions of the constrained
parameter space. In Section II-C the constrained CR
bound of Section II-B is extended to the case of smooth
nonlinear functional inequality constraints. In Section III,
examples of the implementation of the constrained CR
bound are presented.


II.  Lower  Bounds  on  the  Error  Covariance 

Throughout the paper the notation $\theta$ and $[\theta_i]_{i=1,\ldots,n}$ will
denote a column vector, $[\theta_1, \ldots, \theta_n]^T$, of unknown parameters
contained in the unconstrained parameter space
$\Theta = \mathbb{R}^n$. For a particular value of the vector $\theta$ we specify a
probability distribution $P_\theta$ governing the observations $X$,
taking values $x$ in a sample space $\Omega$. The collection of
probability spaces $\mathcal{E} = \{(\Omega, \mathcal{S}, P_\theta)\}_{\theta \in \Theta}$ defines a
$\theta$-indexed set of possible models for $X$, and is called a
statistical experiment over $\Theta$. If it is known that $\theta$ is
restricted to a subset of $\Theta$, called the constrained parameter
space $\Theta_C$, the relevant statistical experiment becomes
the reduced set of models $\mathcal{E}_C = \{(\Omega, \mathcal{S}, P_\theta)\}_{\theta \in \Theta_C}$. In this
context, the constrained parameter estimation problem
can be stated as follows: given a statistical experiment
$\mathcal{E}_C$, a random variable $X$ is observed which has distribution
$P_\theta$; the objective is to specify an estimator $\hat\theta = \hat\theta(X) \in \Theta$
for the parameter vector $\theta$. Define the vector mean
$m_\theta \stackrel{\rm def}{=} E_\theta\{\hat\theta\}$ of $\hat\theta$, where $E_\theta$ denotes expectation with
respect to the distribution $P_\theta$. The objective of this paper is
to investigate the impact of parameter constraints on
bounds for the minimum estimation error, where error is
measured by the covariance matrix

$$\Sigma_\theta \stackrel{\rm def}{=} E_\theta\{(\hat\theta - m_\theta)(\hat\theta - m_\theta)^T\}. \qquad (1)$$

We say that a matrix $B$ is a lower bound on a matrix $A$ if
$A \ge B$ in the sense that $A - B$ is nonnegative definite.


A.  A  Multiple  Parameter  Barankin  Bound 


We first present a Chapman-Robbins version of the
multiple parameter Barankin lower bound on the covariance
matrix for the case where $\theta \in \Theta_C$. Unlike the CR
bound, the Barankin bound requires no regularity conditions
on the distribution $P_\theta$. To achieve a unified treatment
of the cases of continuous and discrete random
variables $X$, we let $P_\theta$ have a density function $f_\theta = f_\theta(x)$
with respect to some reference measure $\mu$:
$P_\theta(A) = \int_A f_\theta \, d\mu$, where $P_\theta(A)$ is the probability that $X \in A$,
$A \in \mathcal{S}$. For a continuous sample space $\Omega$ the previous
integral can be interpreted as the standard (Lebesgue)
integral over $A$, while for discrete $\Omega$, $\mu$ is the counting
measure and the integral can be interpreted as a sum over
elements $x \in A$.

For arbitrary vectors $v_1, \ldots, v_k \in \mathbb{R}^n$ and scalars
$\Delta_1, \ldots, \Delta_k \in \mathbb{R}$, define the scalar and vector finite
differences, $\delta_i f_\theta$ and $\delta_i m_\theta$, of the density function and of the
mean vector for $\hat\theta$, respectively, which are produced by a
change in the underlying parameter from the point $\theta$ to
the point $\theta + \Delta_i v_i$:

$$\delta_i f_\theta \stackrel{\rm def}{=} \frac{f_{\theta + \Delta_i v_i} - f_\theta}{\Delta_i} \qquad (2)$$

$$\delta_i m_\theta \stackrel{\rm def}{=} \frac{m_{\theta + \Delta_i v_i} - m_\theta}{\Delta_i} \qquad (3)$$

These finite differences are the variations in $f_\theta$ and $m_\theta$
along the directions of the vectors $v_1, \ldots, v_k$; a set of
vectors which are henceforth referred to as direction
vectors. Define the row vector of $k$ finite differences,

$$\delta f_\theta = [\delta_1 f_\theta, \ldots, \delta_k f_\theta] \qquad (4)$$

and the $n \times k$ matrix of finite differences

$$\delta m_\theta = [\delta_1 m_\theta, \ldots, \delta_k m_\theta]. \qquad (5)$$

With these definitions we have the following multiple
parameter Chapman-Robbins version of the Barankin
bound [6], [17] when $\theta$ is constrained to lie in the set $\Theta_C$.

Proposition 1: Let the $k+1$ vectors $\theta, \theta + \Delta_1 v_1, \ldots, \theta + \Delta_k v_k$
be arbitrary points contained in the constrained
parameter set $\Theta_C \subset \mathbb{R}^n$. Then for any estimator $\hat\theta$ having
mean $m_\theta$, the estimator error covariance matrix $\Sigma_\theta$ satisfies
the matrix inequality

$$\Sigma_\theta \ge B_\delta \qquad (6)$$

where

$$B_\delta = \delta m_\theta \left[ E_\theta\left\{ \left[\frac{\delta f_\theta}{f_\theta}\right]^T \left[\frac{\delta f_\theta}{f_\theta}\right] \right\} \right]^+ [\delta m_\theta]^T \qquad (7)$$

and the plus sign denotes pseudo-inverse. Equality holds
in (6) if and only if there exists a nonrandom $n \times k$ matrix
$\Gamma$ such that the estimator $\hat\theta$ satisfies

$$\hat\theta - m_\theta = \Gamma \left[\frac{\delta f_\theta}{f_\theta}\right]^T \quad (\text{w.p.1}). \qquad (8)$$

In Proposition 1, the pseudo-inverse of a matrix $A$ is
defined as the unique matrix $A^+$ that satisfies the
Moore-Penrose conditions [2, Ch. 3], [21, Section LbS]:

1) $AA^+$ and $A^+A$ are symmetric,

2) $AA^+A = A$,

3) $A^+AA^+ = A^+$. (9)

The conditions 1)-3) are a statement of the fact that
$AA^+$ and $A^+A$ are projection operators onto the range of
$A$ and $A^+$, respectively. Pseudo-inverses always exist, are
continuous under certain conditions [26], and if $A$ is
invertible $A^+ = A^{-1}$.

Before proving Proposition 1, we make the following
observations. Since only a pseudo-inverse is required for
the bound of Proposition 1, the covariance matrix,
$E_\theta\{[\delta f_\theta / f_\theta]^T [\delta f_\theta / f_\theta]\}$, of the finite difference vector does
not have to be invertible. This general form is necessary
for the present application since parameter constraints
can reduce the rank of the covariance matrix. In view of
the definition (4) of the finite difference vector $\delta f_\theta$, the
bound (6) is a measure of the variation of the probability
density $f_\theta$ relative to the set of "test" points
$\theta + \Delta_1 v_1, \ldots, \theta + \Delta_k v_k$, which are arbitrarily specified in the
constrained parameter space $\Theta_C$. On the other hand,
since $\Theta_C \subset \Theta$, it is obvious that

$$\max_{\Theta} B_\delta \ge \max_{\Theta_C} B_\delta,$$

where each maximization is performed over the set of
admissible test points in the parameter space. Hence
constraining the parameter space can only reduce the
(greatest) lower bound of the form (6). Thus it is clear
that some bound reduction can occur due to incorporation
of parameter constraints. Due to the difficulty in
finding the best test vectors for (6), however, the amount
of bound reduction is difficult to quantify in general. In
the next section we will derive a constrained CR bound as
a limiting form of the bound (6) for which the impact of
constraints will be much easier to evaluate.

The proof of Proposition 1 depends on the following
generalized version of the Cauchy-Schwarz inequality.

Lemma 1: Let $U \in \mathbb{R}^n$ and $V \in \mathbb{R}^k$ be random column
vectors. Then

$$E_\theta\{UU^T\} \ge E_\theta\{UV^T\} \left[E_\theta\{VV^T\}\right]^+ E_\theta\{VU^T\} \qquad (10)$$

where the plus sign denotes pseudo-inverse. Moreover,
equality holds if and only if there is an $n \times k$ nonrandom
matrix $\Gamma$ such that $U = \Gamma V$ w.p.1.

Note that if the $k \times k$ matrix $E_\theta\{VV^T\}$ is nonsingular,
the matrix inequality (10) is the standard Cauchy-Schwarz
inequality for random vectors.

Proof of Lemma 1: Define the vector $Z = [U^T, V^T]^T$.
Then $E_\theta\{ZZ^T\} \ge 0$ implies the matrix inequality

$$E_\theta\{ZZ^T\} = E_\theta\left\{ \begin{bmatrix} UU^T & UV^T \\ VU^T & VV^T \end{bmatrix} \right\} \ge 0.$$

Let $D$ be the $n \times (n + k)$ partitioned matrix

$$D = \left[ I, \; -E_\theta\{UV^T\}\left[E_\theta\{VV^T\}\right]^+ \right],$$

where $I$ is the $n \times n$ identity. Since $E_\theta\{ZZ^T\}$ is symmetric
and nonnegative-definite, it has a nonnegative square
root: $E_\theta\{ZZ^T\} = E_\theta^{1/2}\{ZZ^T\} E_\theta^{1/2}\{ZZ^T\}$. Thus,
$D E_\theta\{ZZ^T\} D^T = [D E_\theta^{1/2}\{ZZ^T\}][D E_\theta^{1/2}\{ZZ^T\}]^T \ge 0$, and use of
property 3) of (9) results in

$$E_\theta\{UU^T\} - E_\theta\{UV^T\}\left[E_\theta\{VV^T\}\right]^+ E_\theta\{VU^T\} \ge 0.$$

This equation can be reexpressed as
$E_\theta\{(U - \Gamma V)(U - \Gamma V)^T\} \ge 0$, where $\Gamma = E_\theta\{UV^T\}[E_\theta\{VV^T\}]^+$.
Equality holds if and only if the eigenvalues, $\lambda_i$, of the matrix
$E_\theta\{(U - \Gamma V)(U - \Gamma V)^T\}$ are zero. Furthermore, the nonnegative
definiteness of this matrix implies that $\lambda_1 = \cdots = \lambda_n = 0$
if and only if $0 = \sum_i \lambda_i = \mathrm{tr}(E_\theta\{(U - \Gamma V)(U - \Gamma V)^T\}) =
E_\theta\{(U - \Gamma V)^T (U - \Gamma V)\}$. Hence, equality holds in (10) if
and only if $U = \Gamma V$ w.p.1. □

Using the previous Lemma, Proposition 1 is proven.

Proof of Proposition 1: Define the $n$-vector $U$ and the
$k$-vector $V$:

$$U = \hat\theta - m_\theta, \qquad V = \left[\frac{\delta f_\theta}{f_\theta}\right]^T,$$

where $m_\theta$ is the mean vector of $\hat\theta$ and $\delta f_\theta$ is the vector of
finite differences defined in (4). With these definitions,
application of Lemma 1 gives a lower bound involving the
pseudo-inverse of the $k \times k$ matrix $E_\theta\{VV^T\}$ and the
$k \times n$ and $n \times k$ matrices $E_\theta\{VU^T\}$ and $E_\theta\{UV^T\}$, respectively.
If it can be shown that $E_\theta\{UV^T\} = \delta m_\theta$, Proposition
1 would be established. Consider the $j$th column of
$E_\theta\{UV^T\}$ and recall the definition (4) of $\delta f_\theta$:

$$E_\theta\left\{ (\hat\theta - m_\theta) \frac{\delta_j f_\theta}{f_\theta} \right\}
= \frac{1}{\Delta_j} \int (\hat\theta - m_\theta)\left( f_{\theta + \Delta_j v_j} - f_\theta \right) d\mu
= \frac{m_{\theta + \Delta_j v_j} - m_\theta}{\Delta_j}
= \delta_j m_\theta.$$

Hence $E_\theta\{UV^T\} = \delta m_\theta$, which establishes Proposition 1.

B. The Constrained CR Bound

We first obtain a constrained CR bound for locally
convex $\Theta_C$ directly from the bound (6). We then show
that the same bound holds for points $\theta \in \Theta_C$ at which $\Theta_C$
can be approximated by a union of locally convex sets.
These results are then used in Section II-C to construct
CR bounds when $\Theta_C$ is specified by continuously
differentiable functional constraints.

Let $\theta$ and the $k$ linearly independent test vectors
$\theta + \Delta_1 v_1, \ldots, \theta + \Delta_k v_k$ be contained in the reduced
parameter space $\Theta_C$ for all sufficiently small $\Delta_i$, $i = 1, \ldots, k$.
Such test vectors can always be found for points $\theta$ that
are in locally convex regions of $\Theta_C$ with dimension at
least $k$. Assuming the exchange of limiting and expectation
operations is valid, the limit of the bound $B_\delta$ (6) of
Proposition 1, as $\Delta_i \to 0$, $i = 1, \ldots, k$, gives a bound which
depends only on the directional derivatives, $\lim_{\Delta_i \to 0} \delta_i f_\theta$
and $\lim_{\Delta_i \to 0} \delta_i m_\theta$, of $f_\theta$ and the mean vector, $m_\theta$, along
the directions of the vectors $v_i$, $i = 1, \ldots, k$, at the point $\theta$.
Specifically, by the chain rule we would have:

$$\lim_{\Delta_1, \ldots, \Delta_k \to 0} \delta f_\theta = \nabla f_\theta K \quad \text{and} \quad \lim_{\Delta_1, \ldots, \Delta_k \to 0} \delta m_\theta = \nabla m_\theta K,$$

where $K = [v_1, \ldots, v_k]$ is the $n \times k$ matrix of direction
vectors; $\nabla f_\theta$ is the $1 \times n$ (row-vector) gradient of $f_\theta$ and
$\nabla m_\theta$ is the $n \times n$ matrix whose rows are the gradient
vectors associated with each scalar component of $m_\theta$. If
we could substitute the above limiting expressions into
the right-hand side of (6) we would obtain

$$\Sigma_\theta \ge [\nabla m_\theta] K (K^T J_\theta K)^+ K^T [\nabla m_\theta]^T \qquad (11)$$

where

$$J_\theta = E_\theta\{[\nabla \ln f_\theta]^T [\nabla \ln f_\theta]\} \qquad (12)$$

is the $n \times n$ Fisher information matrix. Under appropriate
regularity conditions [13, Lemma 8.1], [27, Section 2.4],
the Fisher matrix is equivalent to

$$J_\theta = -E_\theta\{\nabla^2 \ln f_\theta\}, \qquad (13)$$

where $\nabla^2 \ln f_\theta$ is the Hessian matrix of partial derivatives
of $\ln f_\theta$ with respect to elements of $\theta$. This motivates the
following lemma.

Lemma 2: Let the vector $\theta$ be in the constrained
parameter space $\Theta_C \subset \mathbb{R}^n$, and let $\{v_i\}_{i=1}^k$ be $k$ linearly
independent vectors such that $\theta + \Delta_i v_i \in \Theta_C$ for all
sufficiently small $\Delta_i > 0$, $i = 1, \ldots, k$. Then for any estimator $\hat\theta$
having mean $m_\theta$, the estimator error covariance matrix
$\Sigma_\theta$ satisfies the matrix inequality

$$\Sigma_\theta \ge B_\theta \stackrel{\rm def}{=} \limsup_{\Delta_1, \ldots, \Delta_k \to 0} B_\delta, \qquad (14)$$

where $B_\delta$ is the bound (6) of Proposition 1. If in addition
the following four regularity conditions hold:

•  $\hat\theta$ has finite variance: $\mathrm{var}(\hat\theta_i) < \infty$; (15)

•  $f_\theta$ has continuous partial derivatives; (16)

•  $\ldots < \infty$; (17)

•  the matrix $E_\theta\{[\nabla \ln f_\theta]^T [\nabla \ln f_\theta]\}$ is positive definite; (18)

then

$$B_\theta = [\nabla m_\theta] A [A^T J_\theta A]^+ A^T [\nabla m_\theta]^T \qquad (19)$$

where $J_\theta$ is the positive definite $n \times n$ Fisher matrix (12),
and $A$ is any $n \times n$ matrix whose column space equals
$\mathrm{span}\{v_1, \ldots, v_k\}$. Under these regularity conditions, equality
is achieved in the lower bound (14) if and only if there
exists a nonrandom $n \times n$ matrix $\Gamma$ such that:

$$\hat\theta - m_\theta = \Gamma A^T [\nabla \ln f_\theta]^T \quad (\text{w.p.1}). \qquad (20)$$

If such an estimator $\hat\theta$ exists, this estimator is called an
efficient constrained estimator.
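
The claim that the bound (19) depends on the direction vectors only
through their span can be checked numerically. In the sketch below the
Fisher matrix, the direction vectors, and the mixing matrix `T` are
randomly generated stand-ins, the estimator is taken to be unbiased
(so the mean gradient is the identity), and the function name
`constrained_crb` is a hypothetical helper, not notation from the paper.

```python
import numpy as np

def constrained_crb(J, A, grad_m=None):
    """[grad m] A (A^T J A)^+ A^T [grad m]^T; grad_m defaults to I (unbiased)."""
    n = J.shape[0]
    M = np.eye(n) if grad_m is None else grad_m
    return M @ A @ np.linalg.pinv(A.T @ J @ A, rcond=1e-10) @ A.T @ M.T

rng = np.random.default_rng(0)
n, k = 4, 2
L = rng.standard_normal((n, n))
J = L @ L.T + n * np.eye(n)                        # made-up positive definite Fisher matrix
K = rng.standard_normal((n, k))                    # direction vectors v_1, v_2 as columns
T = rng.standard_normal((n, n)) + 3 * np.eye(n)    # invertible mixing matrix for this seed
A = np.hstack([K, np.zeros((n, n - k))]) @ T       # n x n, same column span as K

B_K = K @ np.linalg.inv(K.T @ J @ K) @ K.T         # form with the full-rank n x k matrix K
B_A = constrained_crb(J, A)                        # form with the rank-deficient n x n matrix A
```

`B_K` and `B_A` agree to rounding error, and both are dominated by the
unconstrained bound `inv(J)`. An explicit `rcond` is passed to `pinv`
because `A.T @ J @ A` is rank deficient and its numerically zero
singular values should be truncated.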


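Proof of Lemma 2: By assumption, $\theta + \Delta_1 v_1, \ldots, \theta + \Delta_k v_k$
are contained in $\Theta_C$ for all $\Delta_i$ sufficiently small,
$i = 1, \ldots, k$, and the bound (14) follows directly from the
Barankin bound of Proposition 1.

The regularity conditions (15)-(17) ensure that the
Fisher matrix $J_\theta$ (12) exists and has bounded elements
[13, Section 1.7], and condition (18) says that $J_\theta$ is positive
definite.

We next derive the limits as $\Delta_1, \ldots, \Delta_k \to 0$ of the
matrices $E_\theta\{[\delta f_\theta / f_\theta]^T [\delta f_\theta / f_\theta]\}$ and $\delta m_\theta$ under the stated
regularity conditions of Lemma 2. Define $\Delta = \max_i |\Delta_i|$. Let $K$
be the $n \times k$ matrix with columns $v_1, \ldots, v_k$. By condition
(16) and the chain rule

$$\lim_{\Delta_1, \ldots, \Delta_k \to 0} \frac{\delta f_\theta}{f_\theta} = \frac{\nabla f_\theta}{f_\theta} K = \nabla \ln f_\theta \, K.$$

From this, and the stated continuity of $\nabla \ln f_\theta$, condition
(16), the $ij$th element of $[\delta f_\theta / f_\theta]^T [\delta f_\theta / f_\theta]$ is dominated by
a function which has finite expectation by condition (17).
Hence, by dominated convergence [3, Theorem 16.4], we have
the finite limit

$$\lim_{\Delta_1, \ldots, \Delta_k \to 0} E_\theta\left\{ \left[\frac{\delta f_\theta}{f_\theta}\right]^T \left[\frac{\delta f_\theta}{f_\theta}\right] \right\}
= K^T E_\theta\{[\nabla \ln f_\theta]^T [\nabla \ln f_\theta]\} K
= K^T J_\theta K. \qquad (21)$$

Next consider the $n \times k$ matrix

$$\delta m_\theta = E_\theta\left\{ \hat\theta \, \frac{\delta f_\theta}{f_\theta} \right\} = E_\theta\left\{ (\hat\theta - m_\theta) \frac{\delta f_\theta}{f_\theta} \right\},$$

where the last equality results from the identity
$E_\theta\{\delta f_\theta / f_\theta\} = 0$. Now from condition (16) the elements of the
$n \times k$ matrix $(\hat\theta - m_\theta)\, \delta f_\theta / f_\theta$ are equal to the elements of
$(\hat\theta - m_\theta) \nabla \ln f_\theta K$ to order $O(\Delta)$. The Schwarz inequality
and the regularity conditions (15) and (17) can be used to
establish that the elements of the latter matrix have finite
absolute expectation

$$E_\theta\left\{ \left| (\hat\theta_i - [m_\theta]_i) \frac{\partial \ln f_\theta}{\partial \theta_j} \right| \right\}
\le \mathrm{var}^{1/2}(\hat\theta_i) \, E_\theta^{1/2}\left\{ \left( \frac{\partial \ln f_\theta}{\partial \theta_j} \right)^2 \right\} < \infty.$$

Hence, by dominated convergence, the limit
$\lim_{\Delta_1, \ldots, \Delta_k \to 0} \delta m_\theta$ exists and is equal to the finite matrix

$$\lim_{\Delta_1, \ldots, \Delta_k \to 0} \delta m_\theta = E_\theta\left\{ \hat\theta \, \frac{\nabla f_\theta}{f_\theta} \right\} K
= \nabla E_\theta\{\hat\theta\} K
= \nabla m_\theta K. \qquad (22)$$

Since the columns, $\{v_i\}_{i=1}^k$, of $K$ are linearly independent,
by condition (18) $K^T J_\theta K$ is a full rank invertible
matrix and $[K^T J_\theta K]^+ = (K^T J_\theta K)^{-1}$. Since the matrix
$K^T J_\theta K$ is symmetric and positive definite the eigenvalues
of the perturbed matrix $K^T J_\theta K + E$ are positive for a
sufficiently small matrix perturbation $E$ [12, Corollary
6.3.4]. This implies that the inverse of $K^T J_\theta K$ is continuous
in perturbations of its elements:

$$\left[ E_\theta\left\{ \left[\frac{\delta f_\theta}{f_\theta}\right]^T \left[\frac{\delta f_\theta}{f_\theta}\right] \right\} \right]^{-1}
= \left[ K^T J_\theta K + O(\Delta) \right]^{-1}
= \left[ K^T J_\theta K \right]^{-1} + o(1), \qquad (23)$$

where $O(\Delta)$ and $o(1)$ are matrices whose elements are of
order $O(\Delta)$ and of order $o(1)$, respectively. In view of (21)
we therefore have

$$\limsup_{\Delta_1, \ldots, \Delta_k \to 0} B_\delta
= \lim_{\Delta_1, \ldots, \Delta_k \to 0} [\delta m_\theta]
\lim_{\Delta_1, \ldots, \Delta_k \to 0} \left[ E_\theta\left\{ \left[\frac{\delta f_\theta}{f_\theta}\right]^T \left[\frac{\delta f_\theta}{f_\theta}\right] \right\} \right]^+
\lim_{\Delta_1, \ldots, \Delta_k \to 0} [\delta m_\theta]^T
= [\nabla m_\theta] K [K^T J_\theta K]^{-1} K^T [\nabla m_\theta]^T. \qquad (24)$$

It remains to show that the bound (24) depends only on
the range space of $K = [v_1, \ldots, v_k]$. Let $A$ be an $n \times n$
matrix whose column span is identical to the span of
$v_1, \ldots, v_k$. Since the column spaces of $A$ and $K$ are
identical, there exists an invertible $n \times n$ matrix $T$ such
that

$$[K \;\; O_1] T = A,$$

where $O_1$ is an $n \times (n - k)$ matrix of zeros. Let $O_2$ and
$O_3$ be $(n - k) \times (n - k)$ and $k \times (n - k)$ matrices of zeros,
respectively. Then,

$$K [K^T J_\theta K]^{-1} K^T
= [K \;\; O_1] \begin{bmatrix} K^T J_\theta K & O_3 \\ O_3^T & O_2 \end{bmatrix}^+ [K \;\; O_1]^T$$

$$= [K \;\; O_1] T \left( T^T \begin{bmatrix} K^T J_\theta K & O_3 \\ O_3^T & O_2 \end{bmatrix} T \right)^+ T^T [K \;\; O_1]^T$$

$$= A [A^T J_\theta A]^+ A^T,$$

where the second equality follows from (65) of Lemma 5
in the Appendix.

The condition for equality in the bound (14), under the
regularity conditions (15)-(17), can be obtained by making
the identifications $U = (\hat\theta - m_\theta)$, $V = K^T [\nabla \ln f_\theta]^T$ in
Lemma 1, verifying that the right side of the resultant
bound (10) is identical to the right side of the bound (14)
and invoking the necessary and sufficient condition for
equality in (10): $U = \Gamma V$ for some $n \times k$ matrix $\Gamma$. This
gives

$$\hat\theta - m_\theta = \Gamma K^T [\nabla \ln f_\theta]^T \quad (\text{w.p.1}).$$

Since $A$ has the identical column span as $K$, the above is
equivalent to condition (20). □

The constrained CR bound (19) of Lemma 2 is in a
general form that is applicable to nonisolated points $\theta$ in
locally convex regions of the parameter space $\Theta_C$. It is
significant that, unlike the Barankin bound of Proposition
1, the constrained CR bound (19) only depends on the
test points through the span of the set $\{v_1, \ldots, v_k\}$. In
particular, when $\Theta_C$ is only $p$-dimensional in the
neighborhood of $\theta$, and $p < n$, all $p$-dimensional sets of test
points are equivalent in the sense that the limit (19) of the
Barankin bound is the same.

The construction of Lemma 2 requires that $\Theta_C$ be
locally convex or star-shaped in the neighborhood of $\theta$.
Lemma 2 can be extended to include nonisolated points
in regions of $\Theta_C$ that have the property that local
neighborhoods can be approximated to order $o(\Delta)$ by locally
convex neighborhoods. The result is the following lemma.

Lemma 3: Let the vector $\theta$ be in the constrained
parameter space $\Theta_C \subset \mathbb{R}^n$, and let $\{v_i\}_{i=1}^k$ be $k$ linearly
independent vectors such that $\theta + \Delta_i v_i + o(\Delta_i) \in \Theta_C$ for
all $\Delta_i$ sufficiently small, $i = 1, \ldots, k$, where $o(\Delta_i)$ is a
vector in $\mathbb{R}^n$ whose length is of order $o(\Delta_i)$. Then the
conclusions of Lemma 2 remain valid when the vectors $\theta + \Delta_i v_i$
are replaced by $\theta + \Delta_i v_i + o(\Delta_i)$, $i = 1, \ldots, k$.

Proof of Lemma 3: Similarly to (2), let $\delta' f_\theta$ denote
the $k$-length vector of scalar differences
$\delta' f_\theta = [\delta'_1 f_\theta, \ldots, \delta'_k f_\theta]$ where

$$\delta'_i f_\theta \stackrel{\rm def}{=} \frac{f_{\theta + \Delta_i v_i + o(\Delta_i)} - f_\theta}{\Delta_i}. \qquad (25)$$

Define $\delta' m_\theta$ similarly. Let $B'_\delta$ denote the Barankin bound
of Proposition 1 formed with the $k$ test points
$\{\theta + \Delta_1 v_1 + o(\Delta_1), \ldots, \theta + \Delta_k v_k + o(\Delta_k)\}$. We need to
establish that the limits $\limsup_{\Delta_1, \ldots, \Delta_k \to 0} B'_\delta$ and
$\limsup_{\Delta_1, \ldots, \Delta_k \to 0} B_\delta$ are identical.

By assumption (16) $f_\theta$ is continuous and therefore:
$f_{\theta + \Delta_i v_i + o(\Delta_i)} = f_{\theta + \Delta_i v_i} + o(\Delta_i)$. In view of (25) this implies

$$\frac{\delta'_i f_\theta}{f_\theta}
= \frac{f_{\theta + \Delta_i v_i + o(\Delta_i)} - f_\theta}{\Delta_i f_\theta}
= \frac{f_{\theta + \Delta_i v_i} - f_\theta}{\Delta_i f_\theta} + o(1),
\qquad \text{so that} \qquad
\frac{\delta' f_\theta}{f_\theta} = \nabla \ln f_\theta K + o(1).$$

Using the definition of the Fisher matrix and the continuity
of the inverse of the full rank matrix $K^T J_\theta K$,

$$\left[ E_\theta\left\{ \left[\frac{\delta' f_\theta}{f_\theta}\right]^T \left[\frac{\delta' f_\theta}{f_\theta}\right] \right\} \right]^{-1}
= \left[ K^T J_\theta K + o(1) \right]^{-1}
= [K^T J_\theta K]^{-1} + o(1), \qquad (26)$$

where $o(1)$ is a matrix that has $o(1)$ entries that go to zero
as the $\Delta_i$'s go to zero. In a similar manner it can be shown
that $\delta' m_\theta = \nabla m_\theta K + o(1)$, which, when taken with
(26), implies $B'_\delta = B_\delta + o(1)$. This establishes the lemma. □


C. Functional Constraints

Often the constrained parameter space $\Theta_C$ can be
defined in terms of an implicit functional inequality
constraint of the form

$$\mathcal{G}_\theta \le 0, \qquad (27)$$

where $\mathcal{G} = [\mathcal{G}^1, \ldots, \mathcal{G}^q]^T$ is a vector function on $\mathbb{R}^n$,
$\mathcal{G} : \mathbb{R}^n \to \mathbb{R}^q$, and the inequality is to be interpreted
element by element. We will assume that the inequality
constraints are consistent, i.e., there exists at least one
$\theta \in \mathbb{R}^n$ that satisfies (27), and that $\mathcal{G}$ is continuously
differentiable in the sense that the $q \times n$ gradient matrix

$$\nabla \mathcal{G}_\theta = \begin{bmatrix}
\dfrac{\partial \mathcal{G}^1_\theta}{\partial \theta_1} & \cdots & \dfrac{\partial \mathcal{G}^1_\theta}{\partial \theta_n} \\
\vdots & & \vdots \\
\dfrac{\partial \mathcal{G}^q_\theta}{\partial \theta_1} & \cdots & \dfrac{\partial \mathcal{G}^q_\theta}{\partial \theta_n}
\end{bmatrix} \qquad (28)$$

exists and has continuous elements.

With the parameterization (27) of $\Theta_C$, the boundary of
$\Theta_C$ is defined as the set of points where at least one
component, $\mathcal{G}^i$, of the vector function $\mathcal{G}$ is equal to
zero. The interior of $\Theta_C$ is defined as the set $\{\theta : \mathcal{G}_\theta < 0\}$,
where the strict inequality means $\mathcal{G}^i_\theta < 0$ for each
$i = 1, \ldots, q$.

Note that equality constraints can be imbedded in (27)
by letting $\mathcal{G}^i = -\mathcal{G}^j$ for some $i, j$, $i \ne j$. It is customary
to extract the equality constraints from the inequality
constraints (27), denoting what remains as pure inequality
constraints. This yields the equivalent description of $\Theta_C$:

$$G_\theta = 0, \qquad (29)$$

$$H_\theta \le 0, \qquad (30)$$

where $G = [G^1, \ldots]^T$ and $H = [H^1, \ldots]^T$ are vector
functions of $\theta$. We will say
that the equality constraint (29) is active if it restricts $\theta$ to
a lower dimensional subset of $\mathbb{R}^n$. Otherwise the equality
constraint is said to be inactive.

The decomposition (29) and (30) is accomplished by
partitioning the constraint set $\Theta_C$ into a set of regular
points and nonregular points.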
Definition [jib. Section 'iAj]: The point $\theta_0 \in \mathbb{R}^n$ is called
a regular point of the inequality $\mathcal{G}_\theta \le 0$ (a regular point of
the constraint set $\Theta_C$) if: $\mathcal{G}_{\theta_0} \le 0$ and if there exists a
$v \in \mathbb{R}^n$ such that $\mathcal{G}_{\theta_0} + \nabla \mathcal{G}_{\theta_0} v < 0$.

There can be no active equality constraints at a regular
point $\theta_0$. Specifically, it can be shown that $\theta_0$ is a regular
point of $\Theta_C$ if and only if $\mathcal{G}_{\theta_0 + \Delta v} < 0$ for some $v$
and all sufficiently small $\Delta > 0$ (see proof of Lemma 4).
This implies that there exists a sequence of interior points
(e.g., $\{\theta_0 + \tfrac{1}{n} v\}_n$) that converge to $\theta_0$. Hence regular points
are points that are in the closure of the interior of $\Theta_C$. In
particular, all interior points of $\Theta_C$ are regular points and
points on the boundary of pure inequality constraints
$H_\theta \le 0$ are regular points. See Figs. 1 and 2 for graphical
illustrations.
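
The regular-point condition is easy to test for a given candidate
direction. The sketch below is a hypothetical helper, not part of the
paper: it checks whether a supplied vector `v` witnesses regularity of
`theta0`, using a disk of radius `a` centered at the origin as the pure
inequality constraint and the corresponding circle, written as two
opposite inequalities, as an imbedded equality constraint.

```python
import numpy as np

def is_regular_witness(G, grad_G, theta0, v):
    """True if G(theta0) <= 0 and G(theta0) + grad_G(theta0) v < 0 elementwise."""
    g = np.atleast_1d(G(theta0))
    lin = g + np.atleast_2d(grad_G(theta0)) @ v
    return bool(np.all(g <= 0) and np.all(lin < 0))

a = 1.0

# Pure inequality constraint: closed disk ||theta||^2 - a^2 <= 0.
H = lambda th: np.array([th @ th - a**2])
grad_H = lambda th: 2.0 * th[None, :]

# Equality constraint (circle) imbedded as two opposite inequalities.
G_eq = lambda th: np.array([th @ th - a**2, -(th @ th - a**2)])
grad_G_eq = lambda th: np.vstack([2.0 * th, -2.0 * th])

theta0 = np.array([1.0, 0.0])     # a point on the boundary of the disk
v_inward = -theta0                # direction pointing into the disk

disk_is_regular = is_regular_witness(H, grad_H, theta0, v_inward)
circle_is_regular = is_regular_witness(G_eq, grad_G_eq, theta0, v_inward)
```

For the disk the inward direction is a valid witness, so the boundary
point is regular. For the circle no direction can work, because the two
linearized constraints have opposite signs; the helper only tests the
supplied `v`, so a failed check for one `v` is not by itself a proof of
nonregularity.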


Fig. 1. Equality constraint $G_\theta = (\theta_1 - \theta_1^0)^2 + (\theta_2 - \theta_2^0)^2 - a^2 = 0$. Here
$\theta$ can only vary along boundary of disk. Set of admissible directions,
$\{v\}$, in which parameter can move must lie on tangent hyperplane.
Since $\Theta_C$ has no interior points, there are no regular points of
constraint set.

Fig. 2. Inequality constraint $H_\theta \le 0$, where
$H_\theta = (\theta_1 - \theta_1^0)^2 + (\theta_2 - \theta_2^0)^2 - a^2$. Here $\theta$ can move into
interior of disk. Set of admissible directions is contained in half-space
that is supported by tangent hyperplane. Since any point $\theta \in \Theta_C$ can
be represented as a limit of interior points, all points in $\Theta_C$ are
regular points.

The following Lemma shows that if $\theta$ is a regular point of $\Theta_C$, the constrained CR bound is identical to the unconstrained CR bound.

Lemma 4: Assume that the conditions (15)-(18) of Lemma 2 hold. Let the parameter space $\Theta_C$ be defined by the general inequality constraint $H_\theta \le 0$, where the vector function $H = [H^1, \ldots, H^l]^T$ is differentiable. Let $\theta$ be a regular point of $\Theta_C$. Then for any estimator $\hat\theta$ having mean $m_\theta$, the estimator error covariance matrix $\Sigma_\theta$ satisfies the classical unconstrained CR matrix inequality

$$\Sigma_\theta \ge B_u \tag{31}$$

where

$$B_u = [\nabla m_\theta]\, J_\theta^{-1}\, [\nabla m_\theta]^T \tag{32}$$

and $J_\theta$ is the Fisher matrix (12). Equality holds in (31) if and only if there exists an $n \times n$ matrix $\Gamma$ such that

$$\hat\theta - m_\theta = \Gamma\, [\nabla \ln f_\theta]^T \quad \text{(w.p.1)}. \tag{33}$$

If such an estimator $\hat\theta$ exists, it is called an efficient unconstrained estimator.

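Proof of Lemma 4: Since $\theta$ is a regular point, there exists a $v \in \mathbb{R}^n$ such that for all $\Delta$, $0 < \Delta \le 1$, we have: $(1 - \Delta) H_\theta \le 0$ and $\Delta[H_\theta + \nabla H_\theta v] < 0$. Hence $(1 - \Delta) H_\theta + \Delta[H_\theta + \nabla H_\theta v] = H_\theta + \Delta \nabla H_\theta v < 0$. Since for fixed $v$

$$\frac{1}{\Delta}\left[H_{\theta + \Delta v} - H_\theta\right] - \nabla H_\theta v = o(1),$$

it follows that for all sufficiently small $\Delta$, $H_{\theta + \Delta v} < 0$. In a similar manner, it can be verified that there exists an $\epsilon > 0$ such that for all $\xi$ with length $\le 1$

$$H_{\theta + \Delta(v + \epsilon \xi)} < 0 \quad \text{for all sufficiently small } \Delta > 0, \tag{34}$$

that is, $\theta + \Delta v$ is an interior point of $\Theta_C$. Choose $n$ linearly independent unit length vectors $\xi_1, \ldots, \xi_n$ and define $v_i = v + \epsilon \xi_i$, $i = 1, \ldots, n$. Then, using (34), it is seen that $\{\theta + \Delta v_i\}_{i=1}^n$ is a set of $n$ linearly independent vectors contained in $\Theta_C$ for all sufficiently small $\Delta > 0$. Application of Lemma 2 thus gives the lower bound on the covariance matrix

$$B_\theta = [\nabla m_\theta]\, A\, [A^T J_\theta A]^+ A^T\, [\nabla m_\theta]^T$$

where $A$ is any $n \times n$ matrix with identical column space as $[v_1, \ldots, v_n]$. But the column space of this latter matrix is identical to $\mathbb{R}^n$ by linear independence of the $v_i$'s, so taking $A = I$ in the previous equation for $B_\theta$ we obtain (31).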

The bound (31) of Lemma 4 is identical to the classical multiparameter unconstrained CR bound [21], [27]. Since no equality constraints can be active at the regular points of $\Theta_C$, the Lemma establishes that pure inequality constraints on $\theta$ do not affect the CR bound on the error covariance of estimators having a given mean gradient $\nabla m_\theta$. A number of parameter estimation problems have parameter constraint sets for which all of the points are regular. Examples include: orthant constraints, e.g., positivity of each of the elements $\theta_i$ in the parameter vector $\theta$; range constraints, e.g., magnitude of $\theta_i$ less than 1; length constraints, e.g., $\sum_{i=1}^n \theta_i^2 \le 1$. For these types of constraints the classical unconstrained CR bound applies to all points in $\Theta_C$.
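The regular-point definition can be checked numerically for a scalar constraint such as the disk of Fig. 2. The following is a minimal sketch, not part of the original paper: the helper names `H`, `grad_H` and `is_regular_point` are hypothetical, and the disk constraint $H_\theta = \|\theta - c\|^2 - a^2$ is assumed only for illustration.

```python
import numpy as np

def H(theta, center, a):
    """Inequality constraint H(theta) <= 0 describing a disk of radius a."""
    d = theta - center
    return float(d @ d - a**2)

def grad_H(theta, center):
    """Gradient of H with respect to theta."""
    return 2.0 * (theta - center)

def is_regular_point(theta, center, a):
    """Check the definition: H <= 0 and H + grad_H . v < 0 for some v.

    For a single scalar constraint, v = -grad_H is a sufficient candidate:
    a suitable v exists exactly when H < 0 or the gradient is non-zero.
    """
    h = H(theta, center, a)
    if h > 0:
        return False  # the point is not even feasible
    g = grad_H(theta, center)
    v = -g
    return h + float(g @ v) < 0

center = np.zeros(2)
boundary_point = np.array([1.0, 0.0])   # on the circle of radius 1
interior_point = np.array([0.2, 0.1])   # strictly inside the disk
print(is_regular_point(boundary_point, center, 1.0))
print(is_regular_point(interior_point, center, 1.0))
```

Both the boundary point and the interior point pass the test, which agrees with the statement that every point of a pure inequality constraint set of this type is regular.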

On the other hand, many estimation problems are formulated with parameter constraint sets for which some or all of the points are not regular. In particular, as previously mentioned, for the case of active equality constraints (29), if $\Theta_C$ is a $k$-dimensional surface, $k < n$, then $\Theta_C$ contains no regular points. Examples of these problems are provided in Section III of this paper. For this case, the classical CR bound is invalid and bound reduction occurs due to the constraints.

We now consider the construction of a CR bound under continuously differentiable equality constraints. Assume the equality constraint $G_\theta = 0$ (29) is active at $\theta$. Define the $k \times n$ gradient matrix, $\nabla G_\theta$, of the function $G$. Also define the hyperplane, $\mathcal{H}_\theta$, tangent to the constraint set $\Theta_C$ at the point $\theta$:

$$\mathcal{H}_\theta = \{ v \in \mathbb{R}^n : \nabla G_\theta\, v = 0 \}. \tag{35}$$

If $G$ is a linear function, e.g., $G_\theta = F^T \theta$ for some $n \times k$ matrix $F$, $\mathcal{H}_\theta = \Theta_C$. Otherwise, when $G$ is a continuously differentiable function, any set of points in $\Theta_C$ that are in the local $\Delta$-neighborhood of the point $\theta \in \Theta_C$ are approximated to $o(\Delta)$ by a set of points in the tangent hyperplane $\mathcal{H}_\theta$. Using Lemma 3 this implies that the constrained CR bound $B_c(\theta)$ depends on the equality-constraint function $G$ only through its associated tangent hyperplane at the point $\theta$.

The constrained CR bound for smooth inequality constraints is given in the following theorem.

Theorem 1: Let the regularity conditions (15)-(18) of Lemma 2 be satisfied. Let the parameter space $\Theta_C \subset \mathbb{R}^n$ be defined by the consistent set of equality and pure inequality constraints: $G_\theta = 0$, $H_\theta \le 0$, where the vector functions $G = [G^1, \ldots, G^k]^T$ and $H = [H^1, \ldots, H^l]^T$ are continuously differentiable. Assume that the $k \times n$ gradient matrix $\nabla G_\theta$ has rank $p$, $p \le k$. Then for any estimator $\hat\theta$ having mean $m_\theta$, the estimator error covariance matrix $\Sigma_\theta$ satisfies the matrix inequality

$$\Sigma_\theta \ge B_c \tag{36}$$

where

$$B_c = [\nabla m_\theta]\, Q_\theta J_\theta^{-1}\, [\nabla m_\theta]^T \tag{37}$$

and $Q_\theta$ is the $n \times n$, idempotent, rank $n - p$ matrix

$$Q_\theta = \begin{cases} I, & \text{if } \theta \text{ is a regular point of } \Theta_C \\ I - J_\theta^{-1} [\nabla G_\theta]^T \{ [\nabla G_\theta] J_\theta^{-1} [\nabla G_\theta]^T \}^+ [\nabla G_\theta], & \text{otherwise.} \end{cases} \tag{38}$$

Furthermore, equality holds in (36) if and only if there exists an $n \times n$ matrix $\Gamma$ such that

$$\hat\theta - m_\theta = \Gamma\, Q_\theta^T\, [\nabla \ln f_\theta]^T \quad \text{(w.p.1)}. \tag{39}$$

If such an estimator $\hat\theta$ exists, it is called an efficient constrained estimator.

Proof of Theorem 1: For the case that $\theta$ is a regular point, in view of Lemma 4, there is nothing left to prove. Conversely, suppose that $\theta$ is not a regular point. We will show that any sequence of test points in $\Theta_C$ that converges to $\theta$ approximates an equivalent sequence in $\mathcal{H}_\theta$. Then, for $0 < k \le n - p$, we define $k$ sequences of test points in $\Theta_C$ whose associated approximating sequences in $\mathcal{H}_\theta$ converge to $\theta$ along linearly independent line paths $\theta + \Delta_1 v_1, \ldots, \theta + \Delta_k v_k$, $\Delta_1, \ldots, \Delta_k \to 0$. Finally, with $B_b$ the Barankin bound (7), we show that $\limsup B_b$ is equal to the expression (37) for $B_c$, where the "limsup" is taken over all such sequences of test points.

Let $\xi = \xi(\Delta)$ be a vector such that $\|\xi(\Delta)\| \le \Delta \to 0$ and assume that $\theta + \xi$ is a vector in $\Theta_C$ that converges to $\theta \in \Theta_C$. By the assumed continuous differentiability of $G_\theta$, $\nabla G_\theta\, \xi = o(\Delta)$:

$$0 = G_{\theta + \xi} - G_\theta = \nabla G_\theta\, \xi + o(\|\xi\|) = \nabla G_\theta\, \xi + o(\Delta), \tag{40}$$

where $o(\Delta)$ is a vector of length $o(\Delta)$. Now define $P_\theta = I - \nabla G_\theta^T [\nabla G_\theta \nabla G_\theta^T]^+ \nabla G_\theta$. $P_\theta$ is an orthogonal-projection operator onto the null space of $\nabla G_\theta$, i.e., onto $\mathcal{H}_\theta$ [21, Section 1c.4]. This induces an orthogonal decomposition of $\xi = \xi(\Delta)$ relative to $\mathcal{H}_\theta$: $\xi = P_\theta \xi + (I - P_\theta)\xi$. From (40), $(I - P_\theta)\xi = \nabla G_\theta^T [\nabla G_\theta \nabla G_\theta^T]^+ \nabla G_\theta\, \xi = o(\Delta)$, so that

$$\xi = P_\theta \xi + o(\Delta). \tag{41}$$

Hence to order $\Delta$, $\xi$ is equal to the vector $P_\theta \xi$ that is contained in $\mathcal{H}_\theta$.

Now let $\{\theta + \xi_i(\Delta_i)\}_{i=1}^k$ be $k$ sequences in $\Theta_C$ indexed by $\Delta_1, \ldots, \Delta_k$ such that $P_\theta\, \xi_i(\Delta_i) = \Delta_i v_i$, $i = 1, \ldots, k$, where $v_1, \ldots, v_k$ are fixed linearly independent vectors and $0 < k \le n - p$. Since $G_\theta$ is continuously differentiable and $\mathcal{H}_\theta$ has dimension $n - p$, such sequences exist [8, Prop. 26.1]. Hence, in view of (41), for fixed $\Delta_1, \ldots, \Delta_k$ the $k$ test points $\theta + \xi_1, \ldots, \theta + \xi_k$ are equal to $\theta + \Delta_1 v_1 + o(\Delta_1), \ldots, \theta + \Delta_k v_k + o(\Delta_k)$. Define $B_b(\theta + \xi_1(\Delta_1), \ldots, \theta + \xi_k(\Delta_k))$ the Barankin bound of Proposition 1 evaluated at these test points and define $B_{cr}(v_1, \ldots, v_k)$ the CR bound of Lemma 2 evaluated with the direction vectors $v_1, \ldots, v_k$. Lemma 3 implies

$$B_b(\theta + \xi_1(\Delta_1), \ldots, \theta + \xi_k(\Delta_k)) = B_{cr}(v_1, \ldots, v_k) + o(1) = [\nabla m_\theta]\, A [A^T J_\theta A]^+ A^T\, [\nabla m_\theta]^T + o(1), \tag{42}$$

where $o(1)$ is a matrix of $o(1)$ elements that go to zero as the $\Delta_i$'s go to zero, and $A$ is an $n \times n$ matrix with column space equal to the span of $v_1, \ldots, v_k$.

Next we show that if $v_1, \ldots, v_k$ and $v'_1, \ldots, v'_j$ are sets of vectors in $\mathcal{H}_\theta$ such that $\mathrm{span}\{v_1, \ldots, v_k\} \supseteq \mathrm{span}\{v'_1, \ldots, v'_j\}$ then $A[A^T J_\theta A]^+ A^T \ge B[B^T J_\theta B]^+ B^T$, where $A$ and $B$ are $n \times n$ matrices which have identical column spaces as $\mathrm{span}\{v_1, \ldots, v_k\}$ and $\mathrm{span}\{v'_1, \ldots, v'_j\}$, respectively. Since by definition $v_i \in \mathcal{H}_\theta$, $i = 1, \ldots, k$, this will establish that the matrix $[\nabla m_\theta] A [A^T J_\theta A]^+ A^T [\nabla m_\theta]^T$ on the right of (42) is maximized when the column space of $A$ is equal to $\mathcal{H}_\theta$. With $J_\theta^{1/2}$ the positive square root matrix corresponding to $J_\theta$, the previous relation between the two spans holds if and only if $\mathrm{span}\{J_\theta^{1/2} v_1, \ldots, J_\theta^{1/2} v_k\} \supseteq \mathrm{span}\{J_\theta^{1/2} v'_1, \ldots, J_\theta^{1/2} v'_j\}$. Hence it is sufficient to show that $A[A^T A]^+ A^T \ge B[B^T B]^+ B^T$ when the column space of $A$ contains the column space of $B$. Now $A[A^T A]^+ A^T$ and $I - B[B^T B]^+ B^T$ are idempotent, symmetric, orthogonal-projection matrices onto the column space of $A$ and the null space of $B^T$ [21, Section 1c.4], respectively. Therefore, since the column space of $A$ contains the column space of $B$, $A[A^T A]^+ A^T B = B$ and $B^T A[A^T A]^+ A^T = B^T$. Since idempotent matrices are nonnegative definite, it follows that $A[A^T A]^+ A^T - B[B^T B]^+ B^T = A[A^T A]^+ A^T (I - B[B^T B]^+ B^T) = A[A^T A]^+ A^T (I - B[B^T B]^+ B^T) A[A^T A]^+ A^T$, which is nonnegative definite. Therefore we have from (42)

$$\limsup B_b = [\nabla m_\theta]\, A [A^T J_\theta A]^+ A^T\, [\nabla m_\theta]^T \tag{43}$$

where $A$ is a matrix whose column span equals $\mathcal{H}_\theta$.

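Finally we show that the column span of $Q_\theta$ is equal to $\mathcal{H}_\theta$ and that, setting $A = Q_\theta$ in (43), we obtain (37). Since $\nabla G_\theta$ has rank $p$, there exists a row-echelon representation

$$\nabla G_\theta = T \begin{bmatrix} B \\ O_1 \end{bmatrix}$$

where $T$ is a nonsingular $k \times k$ matrix, $B$ is a $p \times n$ full-row-rank matrix, and $O_1$ is a $(k - p) \times n$ matrix of zeros. Let $O_2$, $O_3$ and $O_4$ denote matrices of zeros having dimensions $(k - p) \times (k - p)$, $(k - p) \times p$ and $k \times n$, respectively. Use of (38) and (65) of Lemma 5 in the Appendix results in

$$\nabla G_\theta\, Q_\theta = \nabla G_\theta - \nabla G_\theta J_\theta^{-1} [\nabla G_\theta]^T \{ \nabla G_\theta J_\theta^{-1} [\nabla G_\theta]^T \}^+ \nabla G_\theta$$
$$= T \begin{bmatrix} B \\ O_1 \end{bmatrix} - T \begin{bmatrix} B J_\theta^{-1} B^T & O_3^T \\ O_3 & O_2 \end{bmatrix} \begin{bmatrix} B J_\theta^{-1} B^T & O_3^T \\ O_3 & O_2 \end{bmatrix}^+ \begin{bmatrix} B \\ O_1 \end{bmatrix}$$
$$= T \begin{bmatrix} B \\ O_1 \end{bmatrix} - T \begin{bmatrix} I & O_3^T \\ O_3 & O_2 \end{bmatrix} \begin{bmatrix} B \\ O_1 \end{bmatrix}$$
$$= O_4,$$

where the invertibility of the full rank $p \times p$ matrix $B J_\theta^{-1} B^T$ has been used on the third line of this equation. This establishes that the columns of $Q_\theta$ are contained in the hyperplane $\mathcal{H}_\theta$. A straightforward calculation shows that $Q_\theta Q_\theta = Q_\theta$ and $Q_\theta^T Q_\theta^T = Q_\theta^T$, i.e., both $Q_\theta$ and $Q_\theta^T$ are idempotent. Hence the rank of $Q_\theta$ is equal to its trace

$$\mathrm{rank}(Q_\theta) = \mathrm{tr}\{Q_\theta\}$$
$$= \mathrm{tr}\{ I - J_\theta^{-1} [\nabla G_\theta]^T \{ [\nabla G_\theta] J_\theta^{-1} [\nabla G_\theta]^T \}^+ [\nabla G_\theta] \}$$
$$= n - \mathrm{tr}( [\nabla G_\theta] J_\theta^{-1} [\nabla G_\theta]^T \{ [\nabla G_\theta] J_\theta^{-1} [\nabla G_\theta]^T \}^+ )$$
$$= n - p,$$

and $Q_\theta$ has $n - p$ linearly independent columns. Since these columns are contained in $\mathcal{H}_\theta$ and since $n - \mathrm{rank}(\nabla G_\theta) = n - p$ is the dimension of $\mathcal{H}_\theta$, this establishes that the column space of $Q_\theta$ is identical to $\mathcal{H}_\theta$. Hence, using $A = Q_\theta$ in Lemma 2, we obtain the bound

$$B_c = [\nabla m_\theta]\, Q_\theta [Q_\theta^T J_\theta Q_\theta]^+ Q_\theta^T\, [\nabla m_\theta]^T.$$

Now it is evident from symmetry that $Q_\theta J_\theta^{-1} = J_\theta^{-1} Q_\theta^T$. Define $J_Q = Q_\theta^T J_\theta Q_\theta$. One can verify that the matrix $Q_\theta J_\theta^{-1} = Q_\theta J_\theta^{-1} Q_\theta^T$ satisfies the Penrose conditions (9) for the pseudo-inverse, $J_Q^+$, of $J_Q$. Using these results and the fact that $Q_\theta$ and $Q_\theta^T$ are idempotent results in

$$Q_\theta [Q_\theta^T J_\theta Q_\theta]^+ Q_\theta^T = Q_\theta J_Q^+ Q_\theta^T$$
$$= Q_\theta J_\theta^{-1} Q_\theta^T$$
$$= Q_\theta Q_\theta J_\theta^{-1}$$
$$= Q_\theta J_\theta^{-1}.$$

Hence (37) is established.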

In  reference  to  Theorem  1  we  make  the  following 
remarks. 

Remark 1: If the set of constraints $G_\theta = 0$ is defined so that the rows of $\nabla G_\theta$ are linearly independent, the $k \times k$ matrix $[\nabla G_\theta] J_\theta^{-1} [\nabla G_\theta]^T$ will be of full rank and $Q_\theta$ (38) will only involve the more familiar inverse matrix $\{[\nabla G_\theta] J_\theta^{-1} [\nabla G_\theta]^T\}^{-1}$. Although a reformulation eliminating redundant constraints can always be accomplished, frequently the most natural description of a constraint involves a rank-deficient $\nabla G_\theta$, e.g., see Example 4 of Section III. In this case the general result of Theorem 1 is applicable.
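The bound of Theorem 1 is straightforward to evaluate numerically, including the rank-deficient case discussed in Remark 1. The following is a minimal sketch under stated assumptions: the function name `constrained_crb`, the 3-parameter Fisher matrix and the duplicated constraint row are all hypothetical and chosen only for illustration.

```python
import numpy as np

def constrained_crb(J, grad_G, grad_m=None):
    """Return (Q, B_c) as in (37)-(38) for a point with active equality constraints.

    J      : (n, n) nonsingular Fisher information matrix
    grad_G : (k, n) gradient of the equality constraints (may be rank deficient)
    grad_m : (n, n) gradient of the estimator mean; identity (unbiased) if None
    """
    n = J.shape[0]
    J_inv = np.linalg.inv(J)
    if grad_m is None:
        grad_m = np.eye(n)
    # The pseudo-inverse absorbs redundant constraints (rank p < k).
    middle = np.linalg.pinv(grad_G @ J_inv @ grad_G.T, rcond=1e-10)
    Q = np.eye(n) - J_inv @ grad_G.T @ middle @ grad_G
    B_c = grad_m @ Q @ J_inv @ grad_m.T
    return Q, B_c

# Hypothetical problem: n = 3, one constraint listed twice (k = 2, p = 1).
J = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
grad_G = np.array([[1.0, 1.0, 1.0],
                   [2.0, 2.0, 2.0]])
Q, B_c = constrained_crb(J, grad_G)
rank_Q = int(round(np.trace(Q)))   # rank of an idempotent matrix equals its trace
print(rank_Q)
```

With the redundant row included, the trace of `Q` still comes out as $n - p = 2$, which is the rank stated in the theorem.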

Remark 2: Comparison between the bound of Lemma 4 and the bound of Theorem 1 indicates that the presence of constraints on the parameter space has the effect of reducing the rank of the Fisher information matrix. In particular, if the $k$ equality constraints $G_\theta = 0$ reduce the dimension of the parameter space from $n$ to $n - p$, then the rank $n$ inverse Fisher information $J_\theta^{-1}$ becomes the rank $n - p$ inverse constrained Fisher information $Q_\theta J_\theta^{-1}$. Hence active equality constraints have the effect of reducing the rank of the Fisher information matrix. In the proof of Theorem 1 it was shown that the column span of $Q_\theta$ is the tangent hyperplane $\mathcal{H}_\theta$, and that $Q_\theta J_\theta^{-1} = Q_\theta [Q_\theta^T J_\theta Q_\theta]^+ Q_\theta^T$. Furthermore, by Lemma 2,

$$Q_\theta [Q_\theta^T J_\theta Q_\theta]^+ Q_\theta^T = A [A^T J_\theta A]^+ A^T$$

if $A$ has the same column span as $Q_\theta$. Using these facts we have

$$Q_\theta J_\theta^{-1} = P_\theta [P_\theta^T J_\theta P_\theta]^+ P_\theta^T,$$

where $P_\theta = I - [\nabla G_\theta]^T ([\nabla G_\theta][\nabla G_\theta]^T)^+ [\nabla G_\theta]$ is the $n \times n$ orthogonal-projection matrix that projects vectors in $\mathbb{R}^n$ onto $\mathcal{H}_\theta$. Hence the inverse constrained Fisher matrix $Q_\theta J_\theta^{-1}$ is obtained from a projection of the rows and columns of the unconstrained Fisher matrix $J_\theta$ onto the tangent hyperplanes of the constraint set.

Remark 3: The matrix $B_c$ (37) in Theorem 1 can be represented as the quantity

$$B_c = [\nabla m_\theta P_\theta] \left( E_\theta\{ [\nabla \ln f_\theta\, P_\theta]^T [\nabla \ln f_\theta\, P_\theta] \} \right)^+ [\nabla m_\theta P_\theta]^T$$

where $P_\theta$ is the projection operator defined in Remark 2. The vectors $\nabla m_\theta P_\theta$ and $\nabla \ln f_\theta\, P_\theta$ are the projections of the unconstrained gradients of the mean and log-likelihood (score) functions onto the constraint tangent hyperplane $\mathcal{H}_\theta$, that is, these vectors correspond to constrained gradient vectors. In [10] these constrained gradient vectors were used along with Lemma 1 to give an alternative derivation of the inequality $\Sigma_\theta \ge B_c$.
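The projection identity of Remark 2 can be checked numerically. The sketch below is illustrative only: the randomly generated Fisher matrix and constraint gradient are hypothetical stand-ins, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 2
M = rng.standard_normal((n, n))
J = M @ M.T + n * np.eye(n)              # hypothetical positive-definite Fisher matrix
grad_G = rng.standard_normal((k, n))     # hypothetical constraint gradient
J_inv = np.linalg.inv(J)

# Q from (38) and the orthogonal projector P onto the null space of grad_G.
Q = np.eye(n) - J_inv @ grad_G.T @ np.linalg.pinv(grad_G @ J_inv @ grad_G.T) @ grad_G
P = np.eye(n) - grad_G.T @ np.linalg.pinv(grad_G @ grad_G.T) @ grad_G

lhs = Q @ J_inv                                        # inverse constrained Fisher matrix
rhs = P @ np.linalg.pinv(P @ J @ P, rcond=1e-10) @ P   # projected form of Remark 2
max_gap = float(np.max(np.abs(lhs - rhs)))
print(max_gap)
```

The two expressions agree to rounding error, so either form can be used, depending on whether the constraint gradient or a projector onto the tangent hyperplane is more convenient.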

Remark 4: Theorem 1 indicates that a certain bound reduction is induced by adding constraints on $\theta$. In particular, it is easy to show that the constrained CR bound $B_c$ of Theorem 1 is always less than the unconstrained CR bound $B_u$ in the sense that $B_u - B_c$ is nonnegative definite. This follows from: 1) the idempotence of $I - Q_\theta$; 2) the symmetry of $J_\theta^{-1}$ and $Q_\theta J_\theta^{-1}$, which imply that $(I - Q_\theta) J_\theta^{-1} = J_\theta^{-1} (I - Q_\theta)^T$; and 3) the nonnegative definiteness of $J_\theta^{-1}$. In particular, for unbiased estimators $\nabla m_\theta = I$ and

$$B_c = Q_\theta J_\theta^{-1} = J_\theta^{-1} - (I - Q_\theta) J_\theta^{-1} (I - Q_\theta)^T. \tag{44}$$

An important implication of (44) is that the incorporation of constraints can only reduce the CR bound on the component error variances.
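A two-parameter case makes the bound reduction concrete. This is a small sketch with hypothetical numbers: the Fisher matrix and the constraint $\theta_1 = \theta_2$ are assumptions chosen so that the result can be checked by hand.

```python
import numpy as np

J = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # hypothetical Fisher matrix
grad_G = np.array([[1.0, -1.0]])           # hypothetical constraint theta_1 - theta_2 = 0
J_inv = np.linalg.inv(J)
Q = np.eye(2) - J_inv @ grad_G.T @ np.linalg.inv(grad_G @ J_inv @ grad_G.T) @ grad_G

B_u = J_inv                                # unconstrained bound, unbiased case
B_c = Q @ J_inv                            # constrained bound, unbiased case
I_minus_Q = np.eye(2) - Q
gap = B_u - B_c
gap_alt = I_minus_Q @ J_inv @ I_minus_Q.T  # quadratic form implied by items 1)-3)
eigs = np.linalg.eigvalsh((gap + gap.T) / 2)
print(np.diag(B_u), np.diag(B_c))
```

Here the variance bound on each component drops from $2/3$ to $1/6$. The value $1/6$ is what one expects when the two parameters are tied together, since the single remaining parameter then has Fisher information $\mathbf{1}^T J \mathbf{1} = 6$.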

Remark 5: In many examples of interest $Q_\theta$ is nondiagonal, accounting for the functional relationships between individual components of $\theta$ introduced by the constraint. Thus even if $J_\theta$ is diagonal, suggesting uncorrelated unconstrained estimator errors, the rank-reduced inverse Fisher information $Q_\theta J_\theta^{-1}$ in Theorem 1 can have off-diagonal terms, suggesting correlated constrained estimator errors.

Remark 6: A result of Lemma 4 and Theorem 1 is that pure inequality constraints $H_\theta \le 0$ do not affect the CR bound on error covariance of estimators with a given mean gradient $\nabla m_\theta$. This is true even when $\theta$ is on the boundary of this set. An interpretation of this fact is obtained by recalling that the Fisher information matrix $J_\theta$ (12) is a function of the gradient of the likelihood surface at $\theta$. For a smooth surface, the gradient of the surface at $\theta$ is completely determined by the set of directional derivatives along directions contained in a convex cone with vertex at $\theta$, e.g., the half-space indicated in Fig. 2. In the case of one-dimensional differentiable functions, this simply reflects the equivalence of right and left derivatives. Therefore, the restriction of allowable local variations of a parameter at the boundary of $H_\theta \le 0$ does not affect the CR bound.

Remark 7: While Theorem 1 is stated as a lower bound on the estimator error covariance matrix, it can be used to specify a bound on the mean-square error (mse) matrix, $E_\theta\{(\hat\theta - \theta)(\hat\theta - \theta)^T\}$. Specifically, since the mse matrix is equal to $\Sigma_\theta + (m_\theta - \theta)(m_\theta - \theta)^T$, application of the theorem gives a constrained CR bound on mse:

$$E_\theta\{(\hat\theta - \theta)(\hat\theta - \theta)^T\} \ge B_c + (m_\theta - \theta)(m_\theta - \theta)^T,$$

where $B_c$ is given by (37).

Remark 8: Remarks 6 and 7 notwithstanding, when $H$ corresponds to a pure inequality constraint, Theorem 1 does not imply that improvement in mse is impossible. Indeed the minimum-distance projection of an unconstrained estimator $\hat\theta_u$ onto $\Theta_C$ can yield an estimator with lower mse than that of $\hat\theta_u$. Such an estimator arises in the example studied in [18]. However, if the estimators differ, the projected estimator may have a different mean from that of $\hat\theta_u$, which generally is not differentiable, whereas Theorem 1 applies to classes of estimators with identical differentiable means $m_\theta$.

Remark 9: In the course of proof of Theorem 1 it was established that the lower bound $B_c$ (36) is the tightest bound of the form (14) in the sense that $B_c = \limsup B_b(\theta + \xi_1(\Delta_1), \ldots, \theta + \xi_k(\Delta_k))$, where $\{\theta + \xi_i(\Delta_i)\}_{i=1}^k$ are $k$ arbitrary sequences converging to $\theta$ along paths whose projections onto the tangent plane $\mathcal{H}_\theta$ are $k$ linearly independent line segments, $0 < k \le n - p$. For linear constraints and exponential families of $f_\theta$ more can be proven: $B_c$ is the "limsup" of the Barankin bound $B_b$ (7) with respect to arbitrary sequences of test points converging to $\theta$, i.e., $B_c$ is the tightest local Barankin bound.

III. Applications

In  this  section  we  illustrate  the  application  of  the 
constrained  CR  bound  (37)  by  specializing  to  the  cases  of 
linear  and  quadratic  constraints. 

Example 1) Linearly Constrained Gauss-Markov Problem: Let $F$ be an $m \times n$ matrix of rank $n$, $n \le m$, and suppose that one observes the vector $X \in \mathbb{R}^m$,

$$X = F\theta + \eta,$$

where $\theta \in \mathbb{R}^n$, $\eta \in \mathbb{R}^m$ and $\eta$ is a zero-mean Gaussian vector with nonsingular $m \times m$ covariance matrix $K = E\{\eta\eta^T\}$. Since the model is linear and Gaussian, the Fisher information matrix is simply calculated as $J = J_\theta = F^T K^{-1} F$, which is independent of $\theta$. Furthermore, by the Gauss-Markov theorem [21, Ch. 4], the minimum variance unbiased (MVU) estimator $\hat\theta_u$ is a linear function of $X$,

$$\hat\theta_u = J^{-1} F^T K^{-1} X.$$

The error covariance of $\hat\theta_u$ is

$$\Sigma_\theta = J^{-1}.$$

Thus $\hat\theta_u$ achieves the unconstrained CR bound, (31) of Lemma 4, for unbiased estimators. (Recall that for unbiased estimators, $\nabla m_\theta = I$.)
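The unconstrained part of this example can be verified without simulation, because the estimator is linear in $X$. The sketch below uses a hypothetical random model matrix and noise covariance; only the formulas for $J$ and $\hat\theta_u$ come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3
F = rng.standard_normal((m, n))       # hypothetical full-column-rank model matrix
L = rng.standard_normal((m, m))
K = L @ L.T + m * np.eye(m)           # hypothetical nonsingular noise covariance

K_inv = np.linalg.inv(K)
J = F.T @ K_inv @ F                   # Fisher information of the linear Gaussian model
W = np.linalg.inv(J) @ F.T @ K_inv    # estimator matrix: theta_hat_u = W @ X

bias_matrix = W @ F                   # equals I, so E[theta_hat_u] = theta
cov_u = W @ K @ W.T                   # error covariance of theta_hat_u
print(np.allclose(cov_u, np.linalg.inv(J)))
```

Since `W @ F` is the identity, the estimator is unbiased, and its covariance `W @ K @ W.T` reduces to $J^{-1}$, the unconstrained CR bound.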

Consider, however, the problem of estimating $\theta$ subject to the $k$ linear equality constraints $G_\theta = A\theta = 0$, where $A$ is a $k \times n$ matrix, $k < n$. Using the fact that $\nabla G_\theta = A$, Theorem 1 gives the constrained CR bound: $B_c = [\nabla m_\theta]\, Q J^{-1}\, [\nabla m_\theta]^T$, where

$$Q = I - J^{-1} A^T [A J^{-1} A^T]^+ A.$$

Since the matrix $Q$ is independent of $\theta$, one can define the estimator

$$\hat\theta = Q J^{-1} F^T K^{-1} X = Q \hat\theta_u. \tag{45}$$

Due to the constraint $A\theta = 0$, $\hat\theta$ is unbiased:

$$E_\theta\{\hat\theta\} = (I - J^{-1} A^T [A J^{-1} A^T]^+ A)\theta = \theta.$$

The error covariance of $\hat\theta$ can be calculated directly from (45) using the idempotence of $Q$:

$$\Sigma_\theta = Q J^{-1} Q^T = Q Q J^{-1} = Q J^{-1} = B_c,$$

where $B_c$ is the constrained CR bound, (37) of Theorem 1, for unbiased estimation. This establishes that: 1) the estimator $\hat\theta$ of (45) is the MVU constrained estimator, and 2) the constrained CR bound of Theorem 1 is achievable for the Gaussian linear model with linear constraints.

Example 2) Image Reconstruction with a Support Constraint: Support constraints are frequently used in image reconstruction problems such as those arising in tomographic imaging [24], [29] and phase retrieval [5], [9]. Suppose that the parameter vector of interest consists of a sampled two-dimensional image that is represented by a complex-valued vector with elements $o_{(i,j)}$, $i, j = 0, 1, \ldots, M - 1$. We will represent the parameter vector $\theta$ as the $2M^2 \times 1$ vector

$$\theta = [o^R_{(0,0)}, o^I_{(0,0)}, \ldots, o^R_{(M-1,M-1)}, o^I_{(M-1,M-1)}]^T,$$

where the superscripts $R$ and $I$ denote respectively the real and imaginary parts of $o_{(i,j)}$.

If the support of the object is known, it can be used as a constraint in the estimation of $\theta$. Let $S$ be the support of $\theta$. Let $1_S$ denote the $2M^2 \times 2M^2$ diagonal matrix with $[1_S]_{ii} = 1$ if the $i$th element of $\theta$ lies inside the support set $S$ and $[1_S]_{ii} = 0$ otherwise, i.e., $1_S$ is the matrix indicator function of $S$. The support constraint then has the form $G_\theta = [I - 1_S]\theta = 0$. From Theorem 1 we have the constrained CR bound $B_c = [\nabla m_\theta]\, Q_\theta J_\theta^{-1}\, [\nabla m_\theta]^T$. Using $\nabla G_\theta = [I - 1_S]$ it is easy to verify:

$$Q_\theta J_\theta^{-1} = J_\theta^{-1} - J_\theta^{-1} [I - 1_S]^T \{ [I - 1_S] J_\theta^{-1} [I - 1_S]^T \}^+ [I - 1_S] J_\theta^{-1}$$
$$= T \left( \tilde J^{-1} - \tilde J^{-1} [I - \tilde 1_S]^T \{ [I - \tilde 1_S] \tilde J^{-1} [I - \tilde 1_S]^T \}^+ [I - \tilde 1_S] \tilde J^{-1} \right) T^T \tag{46}$$

where $\tilde J = T^T J_\theta T$ and $T$ is an orthogonal matrix such that

$$\tilde 1_S = T^T 1_S T = \begin{bmatrix} I & O_1 \\ O_1^T & O_2 \end{bmatrix} \tag{47}$$

where $O_1$ and $O_2$ are zero matrices. In other words, $T$ is a transformation that rearranges the image pixels so that the support is in the upper left hand corner of the image. Now let $\tilde J$ and $\tilde J^{-1}$ have the partitions

$$\tilde J = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \tag{48}$$

$$\tilde J^{-1} = \begin{bmatrix} K & L \\ L^T & M \end{bmatrix} \tag{49}$$

where $A$ and $K$ are matrices of the same dimension as the identity matrix $I$ on the right-hand side of (47). With this notation $[I - \tilde 1_S] \tilde J^{-1} [I - \tilde 1_S]^T$ is the partitioned matrix $\begin{bmatrix} O & O \\ O^T & M \end{bmatrix}$, where $O$ is a zero matrix of the appropriate dimensions. Therefore the pseudo-inverse on the right-hand side of (46) is simply $\begin{bmatrix} O & O \\ O^T & M^{-1} \end{bmatrix}$. Performing the rest of the matrix algebra indicated on the right-hand side of (46) we obtain

$$Q_\theta J_\theta^{-1} = T \begin{bmatrix} K - L M^{-1} L^T & O_1 \\ O_1^T & O_2 \end{bmatrix} T^T.$$

Using identities for the inverse of a partitioned matrix [11, Theorem 8.2.1] and the definitions of $A, B, C$ and $K, L, M$, (48) and (49), the matrix $K - L M^{-1} L^T$ can be identified as the inverse of the block matrix $A$. Hence,

$$Q_\theta J_\theta^{-1} = T \begin{bmatrix} A^{-1} & O_1 \\ O_1^T & O_2 \end{bmatrix} T^T = T \left( \begin{bmatrix} I & O_1 \\ O_1^T & O_2 \end{bmatrix} \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \begin{bmatrix} I & O_1 \\ O_1^T & O_2 \end{bmatrix} \right)^+ T^T = (1_S J_\theta 1_S)^+, \tag{50}$$

where the last equality follows by the orthogonality of $T$, the application of (47), (48), and the identification $T \tilde J T^T = J_\theta$. For the case of unbiased estimation $\nabla m_\theta = I$ and (50) is the constrained CR bound. Comparing the constrained CR bound (50) to the unconstrained CR bound $J_\theta^{-1}$, it is evident that the incorporation of support constraints has the effect of zeroing out those rows and columns of the Fisher information matrix corresponding to image pixels $\theta_i$ for which it is known a priori that the pixel values are zero.

It is useful to compare the covariance of the estimator errors within the support region for the constrained and unconstrained cases. Using the same transformation $T$ (47) as before, we can assume without loss of generality that the support is in the upper left corner of the image, i.e., the support matrix indicator function is $1_S = \begin{bmatrix} I & O_1 \\ O_1^T & O_2 \end{bmatrix}$. In this case the unconstrained bound within the support region is $(A - B C^{-1} B^T)^{-1}$, which is the upper left block element of the inverse matrix $J_\theta^{-1} = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}^{-1}$ (48), while the constrained CR bound for these pixels is $A^{-1}$. If the Fisher matrix is block diagonal then $B$ is a matrix of zeros in (48), indicating that the errors of an unbiased efficient estimator of pixels inside and outside of the support region are uncorrelated: in this case the constrained CR bound is identical to the unconstrained CR bound. If the Fisher matrix is not block diagonal, however, there may be substantial reduction in the constrained CR bound over the support region. It is also significant that, unless $J_\theta$ is block diagonal, setting the pixels of an efficient (CR bound achieving) unconstrained estimator to zero outside the image support region does not produce an estimator that achieves the constrained CR bound. This is in contrast to the results obtained in [5] for diagonal $J_\theta$.
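The comparison between the two bounds on the support region can be reproduced on a small hypothetical problem. In this sketch the Fisher matrix is a random positive-definite matrix with the support already placed in the first `s` coordinates, so the rearrangement $T$ is taken to be the identity; these choices are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, s = 5, 2                                  # n parameters, first s inside the support
M = rng.standard_normal((n, n))
J = M @ M.T + n * np.eye(n)                  # hypothetical non-block-diagonal Fisher matrix
one_S = np.diag([1.0] * s + [0.0] * (n - s)) # matrix indicator function of the support

A, B, C = J[:s, :s], J[:s, s:], J[s:, s:]
constrained = np.linalg.pinv(one_S @ J @ one_S, rcond=1e-10)     # right-hand side of (50)
unconstrained_in_support = np.linalg.inv(A - B @ np.linalg.inv(C) @ B.T)

# Zeroing an efficient unconstrained estimate outside the support gives
# covariance 1_S J^{-1} 1_S, whose support block is the unconstrained one.
truncated_cov = one_S @ np.linalg.inv(J) @ one_S
excess = unconstrained_in_support - np.linalg.inv(A)
excess_eigs = np.linalg.eigvalsh((excess + excess.T) / 2)
print(excess_eigs)
```

The support block of the constrained bound equals $A^{-1}$, while truncating the unconstrained estimator only reproduces $(A - BC^{-1}B^T)^{-1}$, which is larger in the matrix sense whenever $B \ne 0$.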

Example 3) Spectrum Estimation with Power Constraints: When there is prior information on the power of a random process over some regions of frequency, it is reasonable to expect that the achievable error covariance of spectral estimators will be affected. This example quantifies the effect of such prior information on the constrained CR bound.

Let $\{X_i\}_{i=1}^N$ be a segment of a real wide sense stationary random process with power spectral density (PSD) $\{\mathcal{P}(f)\}$, $f \in [-\tfrac12, \tfrac12]$. The objective is to estimate the PSD, $\theta_i = \mathcal{P}(f_i)$, at $n$ distinct frequencies $f_1, \ldots, f_n$. Let the average power of $\{X_i\}$ be known over $P$ nonoverlapping frequency bands

$$\sum_{i \in S_p} \theta_i = \mathcal{P}_p, \quad p = 1, \ldots, P, \tag{51}$$

where $S_p$ is the index set of the $p$th frequency band, and $\mathcal{P}_p$ is the known average power of $\{X_i\}$ over this frequency band. The equations (51) correspond to $P$ linear constraints on the unknown PSD, known as the P-point constraint in robust Wiener filtering theory [20]. The concatenation of the $P$ equalities (51) gives the $P$ equations

$$G_\theta = \begin{bmatrix} \chi_1^T \theta - \mathcal{P}_1 \\ \vdots \\ \chi_P^T \theta - \mathcal{P}_P \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \tag{52}$$

where $\chi_p$ is an $n \times 1$ column vector with $i$th element equal to 1 if $i \in S_p$ and 0 otherwise, i.e., $\chi_p$ is the vector indicator function of $S_p$. The gradient matrix $\nabla G_\theta$ is given by $\nabla G_\theta = [\chi_1, \ldots, \chi_P]^T$, resulting in

$$Q_\theta J_\theta^{-1} = J_\theta^{-1} - J_\theta^{-1} [\chi_1 \cdots \chi_P] \begin{bmatrix} \chi_1^T J_\theta^{-1} \chi_1 & \cdots & \chi_1^T J_\theta^{-1} \chi_P \\ \vdots & & \vdots \\ \chi_P^T J_\theta^{-1} \chi_1 & \cdots & \chi_P^T J_\theta^{-1} \chi_P \end{bmatrix}^+ [\chi_1 \cdots \chi_P]^T J_\theta^{-1}. \tag{53}$$

The structure of $Q_\theta J_\theta^{-1}$ is considerably simplified when $J_\theta$ is the diagonal matrix

$$J_\theta = \mathrm{diag}_i(\theta_i^{-2}),$$

which is appropriate for the case of Gaussian observations $\{X_i\}_{i=1}^N$ and large $N$. Since the frequency bands $\{S_p\}$ are nonoverlapping, the pseudo-inverse on the right-hand side of (53) becomes the pseudo-inverse of a diagonal matrix and

$$Q_\theta J_\theta^{-1} = J_\theta^{-1} - \sum_{p=1}^{P} \frac{J_\theta^{-1} \chi_p \chi_p^T J_\theta^{-1}}{\chi_p^T J_\theta^{-1} \chi_p}. \tag{54}$$

Let $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]^T$ denote the $i$th standard basis vector in $\mathbb{R}^n$. Let $i$ be an index in the constraint set $S_p$. Then for an unbiased estimator, $\hat\theta$, the constrained CR bound on the variance of the $i$th component, $\hat\theta_i$, is obtained from (54)

$$[B_c]_{ii} = e_i^T B_c e_i = e_i^T J_\theta^{-1} e_i - \frac{(e_i^T J_\theta^{-1} \chi_p)^2}{\chi_p^T J_\theta^{-1} \chi_p} = \theta_i^2 \left( 1 - \frac{\theta_i^2}{\sum_{l \in S_p} \theta_l^2} \right). \tag{55}$$

Using the unconstrained CR bound $[B_u]_{ii} = \theta_i^2 = \mathcal{P}^2(f_i)$, we obtain the relative reduction in the CR bound due to the constraint

$$\frac{[B_c]_{ii}}{[B_u]_{ii}} = 1 - \frac{\mathcal{P}^2(f_i)}{\sum_{l \in S_p} \mathcal{P}^2(f_l)}. \tag{56}$$

Since the term on the right hand side of (56) is between 0 and 1, the average power constraint induces a CR bound reduction on the component PSD estimation errors. The bound reduction factor (56) is independent of the other constraint sets $S_k$, $k = 1, \ldots, P$, $k \ne p$, and therefore average power constraints over $S_p$ do not affect PSD estimator errors at frequencies outside of $S_p$. The amount of bound reduction depends on two factors: 1) the relative magnitude of the spectral component of interest, $\mathcal{P}^2(f_i)$, compared to the magnitude of the other frequency components within the frequency band $S_p$; and 2) the length, $|S_p|$ = number of indices, of $S_p$. In particular, little or no reduction in the variance bound occurs for the case where $\mathcal{P}^2(f_i)/\mathcal{P}^2(f_l)$ is small for all $l \in S_p$, $l \ne i$. However, when $\mathcal{P}^2(f_i)$ is large compared to the other $\mathcal{P}^2(f_l)$, $l \in S_p$, a substantial reduction in the bound occurs. This implies that the most bound reduction will be achieved over those constraint regions $S_p$ where the PSD has a high dynamic range, i.e., large peaks. The particular dynamic range required for a significant bound reduction is proportional to $\sqrt{|S_p|}$. As a rule of thumb, for a reduction in the CR bound at frequency $f_i$ by a factor $\alpha$ or more, the ratio of $\mathcal{P}(f_i)$ to the root mean-squared value of the remaining spectral components in $S_p$,

$$\mathcal{P}_{\mathrm{rms}} \stackrel{\mathrm{def}}{=} \sqrt{\frac{1}{|S_p| - 1} \sum_{l \in S_p,\, l \ne i} \mathcal{P}^2(f_l)},$$

must satisfy

$$\frac{\mathcal{P}(f_i)}{\mathcal{P}_{\mathrm{rms}}} \ge \sqrt{(\alpha - 1)(|S_p| - 1)}.$$

Example  Siipial  Subspace  Constraints:  Signal  sub¬ 
space  constraints  are  used  in  sensor  array  processing 
estimation  problems  to  take  account  of  a  particular  struc¬ 
ture  of  the  array  covariance  matrix  [14],  Specifically,  as¬ 
sume  that  p  zero-mean  Gaussian  signals  arrive  at  differ¬ 
ent  angles  of  incidence  on  an  m-sensor  array  having  a 
zero-mean,  spatially  incoherent  array  noise  of  power  a~. 
Further,  assume  that  p  <m.  Then  the  covariance  matrix 
of  the  set  of  sensor  outputs  has  the  singular  value  decom¬ 
position 

^  =  E  cr'/  = 

(* I  (-1 

where  (r,}”  ,  are  the  eigenvectors  of  /i  and  (A,)”  ,  are  the 
eigenvalues: 


the  n  =  m-  element  parameter  vector  6  =  [A,,,.  . 

A|.p,'„.  -  .p[]'.  The  constraint  c)  can  be  then  be  ex¬ 
pressed  as  the  (m  -  p)x  n  matrix  constraint 


G.= 


m  -  p 


U'O 


where  l^  denotes  a  k  ''  k  identity  matrix.  O.  is  a  (m  -  pi 
xin  -  m  ~  p)  matrix  of  zero  entries,  and  1  is  a  t/n  -  p»- 
vector  of  ones. 

Observe  that  the  rows  of  TG,  arc  not  linearly  indepen¬ 
dent  due  to  the  fact  that  there  is  one  redundant  con¬ 
straint  in  c)  of  (57).  Observe  also  that  the  equality 
constraint  c)  creates  a  dimension  n  -  m  ~  p  *  \  linear 
subspace  in  the  unconstrained  parameter  space  Hence, 
despite  the  presence  of  inequality  constainis  a)  and  b). 
the  constrained  parameter  space  contains  no  regular 
points,  and.  by  Theorem  1.  the  constraints  a),  b)  do  not 
impact  the  form  of  the  constrained  CR  bound. 

As  in  Example  2.  partition  J,  according  to 


where  A  \s(m  -  p)xtm  -  p).B  is(m-p)x(n-/n  —  p). 
and  C  is  in-m-rp)xtn-m-i-p).  Then  the  n  x  n  in¬ 
verse  constrained  Fisher  matrix.  of  Theorem  1  is 

given  by 


Q,Ji '  =  Ji 


O, 

O- 


4'. 


where  O,  and  O-  are  zero  matrices  of  dimensions 
(m  -  p}xtn  -  m  +  p>  and  (n  -  m  ~  p)xin  -  m  ~  p).  re¬ 
spectively.  and  Z  is  the  (m  -  p)x(m  -  p)  matrix 


A  =  /'''  '  =  !•■  ■  -.P 

i  <7  ^  I  =  p  1 . •  •  ■ . m 

and  {a',),'!  I  denote  the  signal-dependent  eigenvalues  of  R. 
The  span  of  f,. •  •  -  .tp  is  called  the  signal  subspace. 

Consider the problem of estimating the eigenvalues of R when p is known
but all of the other parameters are unknown. This partial knowledge
induces the following constraints on the $\lambda_j$:

$$\text{a)} \quad 0 < \lambda_j, \qquad j = 1, \ldots, m$$

$$\text{b)} \quad \lambda_j \ge \frac{1}{m-p} \sum_{i=p+1}^{m} \lambda_i, \qquad j = 1, \ldots, m$$

$$\text{c)} \quad \lambda_j - \frac{1}{m-p} \sum_{i=p+1}^{m} \lambda_i = 0, \qquad j = p+1, \ldots, m \qquad (57)$$

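$$Z = \nabla G_\theta \, J_\theta^{-1} \, [\nabla G_\theta]^T = \left( I_{m-p} - \frac{1}{m-p} \mathbf{1}\mathbf{1}^T \right) \left[ A - B C^{-1} B^T \right]^{-1} \left( I_{m-p} - \frac{1}{m-p} \mathbf{1}\mathbf{1}^T \right). \qquad (58)$$
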


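As a simple example, consider the case where the Fisher information
matrix is block diagonal with $B = O_1$ and $A = \sigma^{-2} I_{m-p}$.
Then $Z = \sigma^2 [I_{m-p} - \frac{1}{m-p} \mathbf{1}\mathbf{1}^T]$. Using
condition 3) of (9) it is easy to show that
$Z^+ = \sigma^{-2} [I_{m-p} - \frac{1}{m-p} \mathbf{1}\mathbf{1}^T]$. This
results in

$$Q_\theta J_\theta^{-1} = \begin{bmatrix} \sigma^2 \frac{1}{m-p} \mathbf{1}\mathbf{1}^T & O_1 \\ O_1^T & C^{-1} \end{bmatrix}. \qquad (59)$$
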


where constraint a) arises from the assumed positive-definiteness of R,
constraint b) takes account of the positivity of the signal eigenvalues
$\{\sigma_i^2\}_{i=1}^{p}$, and constraint c) reflects the equality of the
m - p noise eigenvalues.

Let each unknown eigenvector $r_i \in R^m$ be parameterized by its m - 1
direction cosines, $\rho_i = [\rho_{i1}, \ldots, \rho_{i,m-1}]^T$,
i = 1, ..., m. The combination of the m unknown eigenvalues and the
m(m - 1) unknown direction cosines yields


Suppose there exists an efficient unbiased estimator $\hat{\theta}$
for the eigenvalues and eigenvectors which satisfies constraints a) and
b), and assume that the Fisher information is block diagonal as
previously specified. The right-hand side of (59) is then the covariance
matrix of the estimator obtained by replacing each of the m - p noise
eigenvalue estimates in $\hat{\theta}$ by their average
$\frac{1}{m-p} \sum_{i=p+1}^{m} \hat{\lambda}_i$. Hence, if an efficient
unconstrained estimator of the eigenvalues can be found that has
positive elements, the estimator obtained by averaging over the m - p
smallest eigenvalues of the efficient estimator achieves the constrained
CR bound.
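
The averaging argument above can be checked numerically. The following is a minimal sketch, assuming the constrained bound takes the projection form $J^{-1} - J^{-1} G^T [G J^{-1} G^T]^+ G J^{-1}$ (our reading of Theorem 1, which is not reproduced in this section). The sizes m and p, the variance, and the function names `constrained_bound` and `noise_constraint_gradient` are illustrative choices, not taken from the paper.

```python
import numpy as np

def constrained_bound(J, G):
    """Assumed projection form: J^-1 - J^-1 G^T (G J^-1 G^T)^+ G J^-1."""
    J_inv = np.linalg.inv(J)
    Z = G @ J_inv @ G.T
    # rcond discards the zero singular value caused by the redundant constraint
    Z_pinv = np.linalg.pinv(Z, rcond=1e-10)
    return J_inv - J_inv @ G.T @ Z_pinv @ G @ J_inv

def noise_constraint_gradient(m, p):
    """[I - 11^T/(m-p), O_1], with the m-p noise eigenvalues listed first."""
    k, n = m - p, m * m
    T = np.eye(k) - np.ones((k, k)) / k
    return np.hstack([T, np.zeros((k, n - k))])

m, p, sigma2 = 5, 2, 0.3            # illustrative sizes and variance
k, n = m - p, m * m
rng = np.random.default_rng(0)
M = rng.standard_normal((n - k, n - k))
C = M @ M.T + np.eye(n - k)         # arbitrary positive definite block

# Block-diagonal Fisher matrix with A = I / sigma2 and B = 0
J = np.block([[np.eye(k) / sigma2, np.zeros((k, n - k))],
              [np.zeros((n - k, k)), C]])

bound = constrained_bound(J, noise_constraint_gradient(m, p))

# Covariance obtained by averaging k uncorrelated estimates of variance sigma2
averaging = np.ones((k, k)) / k
averaged_cov = averaging @ (sigma2 * np.eye(k)) @ averaging.T
```

Under these assumptions the upper-left block of `bound` should coincide with `averaged_cov`, which is the $\sigma^2 \frac{1}{m-p} \mathbf{1}\mathbf{1}^T$ block of (59), while the lower-right block stays at $C^{-1}$.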

Example: Signal Estimation with Power Constraints:
Consider the problem of estimating the discrete-time signal waveform,
$\theta_1, \ldots, \theta_n$, subject to constraints on the
squared-modulus of the DFT of $\theta$. Here, the sum of the squared
moduli over each of P nonoverlapping frequency intervals is constrained
to be equal to known constants $E_p$, p = 1, ..., P. While similar to the
case studied in Example 3, this problem involves nonlinear quadratic
constraints on the parameters, and time rather than frequency domain
estimation is performed.

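Let $W = [W_1, \ldots, W_n]$ denote the n x n unitary matrix of
orthonormal discrete Fourier transform columns;
$W_i = \frac{1}{\sqrt{n}} [1, e^{-j 2\pi (i-1)/n}, \ldots, e^{-j 2\pi (i-1)(n-1)/n}]^T$,
i = 1, ..., n. Now suppose that for $P \le n$ the constraint takes the
form

$$\sum_{i \in S_p} |[W\theta]_i|^2 = E_p, \qquad p = 1, \ldots, P. \qquad (60)$$
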

Here, $S_p$ denotes the index set of the pth interval and $[W\theta]_i$
is the ith component of the n-point DFT of $\theta$. When P = n, (60)
specifies the modulus of the Fourier transform of $\theta$. As in
Example 2, we let $\mathbf{1}_p$ denote the n x n diagonal matrix with
$[\mathbf{1}_p]_{ii} = 1$ if $i \in S_p$ and $[\mathbf{1}_p]_{ii} = 0$
otherwise. Then the constraint (60) can be written as the set of P
equations

$$\theta^T W^H \mathbf{1}_p W \theta - E_p = 0, \qquad p = 1, \ldots, P$$


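$E[\,|[W\hat{\theta}]_i - [W\theta]_i|^2\,]$, is obtained by evaluating the
corresponding quadratic forms in the constrained and unconstrained
bounds:

$$\frac{\text{constrained CR bound on } \mathrm{var}([W\hat{\theta}]_i)}{\text{unconstrained CR bound on } \mathrm{var}([W\hat{\theta}]_i)} = 1 - \frac{|[W\theta]_i|^2}{\sum_{k \in S_p} |[W\theta]_k|^2}. \qquad (63)$$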

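This is of identical form to the expression obtained for constrained PSD
estimation in Example 3, when the power spectral density is identified
with the magnitude spectrum $|[W\theta]_i|$, i = 1, ..., n. For unbiased
estimators, a bound on the total mean-squared error in estimating the
time domain signal $\theta_i$'s can be determined from (63) by using the
unitary property of the DFT matrix W (Parseval's Theorem):

$$\mathrm{tr}\{Q_\theta J_\theta^{-1}\} = \sigma^2 \, \mathrm{tr}\left\{ W^H \left[ I - \sum_{i=1}^{P} \frac{\mathbf{1}_i W \theta \theta^T W^H \mathbf{1}_i}{\theta^T W^H \mathbf{1}_i W \theta} \right] W \right\}$$

$$= \sigma^2 \, \mathrm{tr}\left\{ I - \sum_{i=1}^{P} \frac{\mathbf{1}_i W \theta \theta^T W^H \mathbf{1}_i}{\theta^T W^H \mathbf{1}_i W \theta} \right\}$$

$$= \sigma^2 [n - P].$$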

Therefore, on the average, the constraints produce a factor of 1 - P/n
reduction in the CR bound on the variances of unbiased estimators of
the $\theta_i$'s.
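
The trace result can be reproduced numerically. The sketch below is a minimal check, assuming the bound has the form $\sigma^2 W^H [I - \sum_p \mathbf{1}_p W\theta\theta^T W^H \mathbf{1}_p / (\theta^T W^H \mathbf{1}_p W\theta)] W$ with W taken as the unitary DFT matrix. The signal, the index sets, and the function name `power_constrained_bound` are illustrative assumptions.

```python
import numpy as np

def power_constrained_bound(theta, index_sets, sigma2):
    """Return (bound, bracket) for DFT power constraints on a real signal."""
    n = theta.size
    W = np.fft.fft(np.eye(n), norm="ortho")      # unitary DFT matrix
    F = W @ theta                                # DFT of the signal
    bracket = np.eye(n, dtype=complex)
    for S in index_sets:
        mask = np.zeros(n)
        mask[list(S)] = 1.0
        v = mask * F                             # 1_p W theta
        # subtract the rank-one projection onto the constrained band
        bracket -= np.outer(v, v.conj()) / np.vdot(v, v).real
    bound = sigma2 * W.conj().T @ bracket @ W
    return bound, bracket

rng = np.random.default_rng(0)
theta = rng.standard_normal(8)                   # illustrative signal, n = 8
index_sets = [(0, 1, 2), (3, 4), (5, 6, 7)]      # P = 3 nonoverlapping bands
sigma2 = 0.5

bound, bracket = power_constrained_bound(theta, index_sets, sigma2)
total_bound = np.trace(bound).real               # compare with sigma2 * (n - P)
```

Each band removes exactly one unit of trace, because the subtracted term is a rank-one projection, so `total_bound` should equal `sigma2 * (8 - 3)` for this choice of bands.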

IV. Conclusion


where the superscript H denotes Hermitian transpose.
The gradient $\nabla G_\theta$ is the P x n matrix


(61) 


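We now specialize to the linear observation model:

$$X_i = \theta_i + \eta_i, \qquad i = 1, \ldots, n$$

where $\eta_i$ is a zero-mean Gaussian white noise with variance
$\sigma^2$. Recalling Example 1, $J_\theta$ can be seen to be the scaled
identity matrix $\sigma^{-2} I$. Let O denote the n x n zero matrix. Using
(61) and the fact that the intervals are nonoverlapping,
$\mathbf{1}_i \mathbf{1}_j = O$, $i \ne j$, the inverse constrained Fisher
matrix of Theorem 1 is the n x n matrix

$$Q_\theta J_\theta^{-1} = \sigma^2 W^H \left[ I - \sum_{i=1}^{P} \frac{\mathbf{1}_i W \theta \theta^T W^H \mathbf{1}_i}{\theta^T W^H \mathbf{1}_i W \theta} \right] W. \qquad (62)$$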


Since W is the (linear) DFT operator, the matrix $[\,\cdot\,]$ on the
right-hand side of (62) is the inverse constrained Fisher information
matrix for estimation of the DFT $W\theta$. As in Example 3, let the
index i be contained in $S_p$. Then the ratio between the constrained and
unconstrained CR bounds on the variance, $\mathrm{var}([W\hat{\theta}]_i) =$


A  constrained  Cramer-Rao  (CR)  lower  bound  on  the 
error  covariance  of  estimators  of  multidimensional  pa¬ 
rameters  has  been  obtained.  The  constrained  CR  bound 
was  derived  from  a  limiting  form  of  a  multiparameter 
Barankin-type bound. For constraint sets defined by a general smooth
functional inequality constraint, the constrained CR bound is equivalent
to the unconstrained CR bound evaluated with a "constrained" Fisher
information matrix. This constrained Fisher matrix was shown to be
identical to the classical unconstrained Fisher matrix at all regular
points of the constraint set, e.g., at interior points. However, at
nonregular points, such as points governed by equality constraints, the
constrained Fisher matrix is a rank-deficient matrix. This constrained
Fisher matrix is equivalent to a
matrix  of  orthogonal  projections  of  the  rows  and  columns 
of  the  unconstrained  Fisher  matrix  onto  the  tangent  hy¬ 
perplanes  of  the  constraint  set.  The  simple  form  of  the 
constrained  CR  bound  allows  the  effect  of  particular 
equality  and  inequality  constraints  to  be  easily  studied 
through  comparisons  between  the  constrained  and  uncon¬ 
strained  CR  bounds.  It  was  shown  that  the  incorporation 
of  functional  constraints  necessarily  decreases  the  CR 
bound for unbiased estimators. Not surprisingly, the constrained bound
was shown to be achievable for the linearly-constrained Gauss-Markov
problem. To illustrate the application of the constrained CR bound,
several applications in the area of signal processing were considered.
These included support constraints in image reconstruction, signal
subspace constraints in array processing, and average power constraints
in spectral estimation and in signal estimation.

In  their  present  form,  the  results  obtained  in  this  paper 
only  directly  apply  to  a  finite  dimensional  parameter 
space  and  a  non-stochastic  constraint.  A  generalization  of 
these  results  to  infinite  dimensional  parameter  spaces 
would  be  useful  for  the  study  of  constraints  in  filtering, 
prediction,  and  smoothing  problems.  Theorem  1  could 
perhaps be applied to complete separable infinite-dimensional
parameter spaces, e.g., a separable Hilbert space,
by  taking  the  formal  limit  of  the  elements  of  the  matrix 
bound  (37)  as  the  dimension  of  the  indicated  matrices 
goes  to  infinity.  Stochastic  constraints  are  of  interest 
when  the  constraint  depends  on  the  particular  realization 
of  the  statistical  experiment,  and  they  provide  a  model  for 
partially-known constraints. A main difficulty in obtaining a
generalization of the constrained CR bound to differentiable stochastic
constraints is that the column space of the constraint equality gradient
matrix is in general a random set and therefore Lemma 2 cannot be
applied. On the other hand, a tractable analysis may be possible for
simple stochastic constraints such as constraints obtained from random
perturbations of the constraint function.

Appendix

Lemma 5: Let Q be an arbitrary n x m matrix and T be any m x m
invertible matrix. Then

$$QT[T^T Q^T Q T]^+ T^T Q^T = Q[Q^T Q]^+ Q^T, \qquad (64)$$

where the plus sign denotes (Moore-Penrose) pseudo-inverse. As a
consequence, if R is an arbitrary m x n matrix, J is an m x m positive
definite matrix, and T is an invertible n x n matrix, then:

$$RT[T^T R^T J R T]^+ T^T R^T = R[R^T J R]^+ R^T. \qquad (65)$$

Proof of Lemma 5: Let the left and right sides of the identity (64) be
denoted as the n x n matrices $P_1$ and $P_2$, respectively. It is easily
verified that $P_1$ and $P_2$ are symmetric and idempotent. Therefore
$P_1$ and $P_2$ are orthogonal projections onto respective subsets,
$\mathcal{W}_1$ and $\mathcal{W}_2$ say, of $R^n$ [22, Section 105].
Furthermore, using properties 1)-3) of (9), it is easily verified that
$P_2 P_1 P_2 = P_2$ and $P_1 P_2 P_1 = P_1$. Equivalently, since
$P_1 P_2$ and $P_2 P_1$ are projections onto respective subsets of
$\mathcal{W}_1$ and $\mathcal{W}_2$, $P_1 P_2 = P_1$ and $P_2 P_1 = P_2$.
However, $P_1 P_2 = P_1$ implies $P_1 \le P_2$, and likewise
$P_2 P_1 = P_2$ implies $P_2 \le P_1$ [22], and hence $P_1 = P_2$.
To show (65), first observe that, due to positive definiteness, there
exists an invertible matrix $J^{1/2}$ such that $J = J^{T/2} J^{1/2}$.
Define $Q = J^{1/2} R$. Then (65) reads

$$J^{-1/2} Q T [T^T Q^T Q T]^+ T^T Q^T J^{-T/2} = J^{-1/2} Q [Q^T Q]^+ Q^T J^{-T/2}$$

which follows directly from (64). This finishes the proof of Lemma 5.
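
As a sanity check on the two identities of Lemma 5, the following sketch evaluates both sides for randomly generated matrices. The dimensions, the random seed, and the deliberately rank-deficient Q are illustrative assumptions; the check relies only on `numpy.linalg.pinv` for the Moore-Penrose pseudo-inverse.

```python
import numpy as np

def pinv(A):
    # Explicit cutoff so near-zero singular values of rank-deficient
    # matrices are treated as exactly zero
    return np.linalg.pinv(A, rcond=1e-10)

rng = np.random.default_rng(1)

# Identity (64): Q is n x m (made rank deficient), T is m x m invertible
n, m = 6, 4
Q = rng.standard_normal((n, m))
Q[:, -1] = Q[:, 0] + Q[:, 1]                     # force a rank deficiency
T = rng.standard_normal((m, m)) + m * np.eye(m)  # well conditioned in practice
lhs64 = Q @ T @ pinv(T.T @ Q.T @ Q @ T) @ T.T @ Q.T
rhs64 = Q @ pinv(Q.T @ Q) @ Q.T

# Identity (65): R is m x n, J is m x m positive definite, T2 is n x n
R = rng.standard_normal((m, n))
A = rng.standard_normal((m, m))
J = A @ A.T + np.eye(m)
T2 = rng.standard_normal((n, n)) + n * np.eye(n)
lhs65 = R @ T2 @ pinv(T2.T @ R.T @ J @ R @ T2) @ T2.T @ R.T
rhs65 = R @ pinv(R.T @ J @ R) @ R.T

err64 = np.abs(lhs64 - rhs64).max()
err65 = np.abs(lhs65 - rhs65).max()
```

Both `err64` and `err65` should be at the level of floating-point round-off. In (65) the matrix $R^T J R$ is n x n with rank at most m < n, so the pseudo-inverse is genuinely needed there.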
APPENDIX C

REFLECTION BY AN ILLUMINATED CYLINDER

Figure C-1 depicts a cross-section of a cylinder illuminated from an
angle $\phi_i$ below the horizon and viewed from an angle $\phi_o$ below
the horizon. For simplicity we assume that the sun illuminates it at
broadside and we are viewing it from broadside. Let $\theta_o$ be the
clockwise angle between the viewing angle and the surface normal at a
given point on the surface, and let $\theta_i$ be the counterclockwise
angle between the illumination angle and the surface normal at a given
point. Then

$$\theta_i + \theta_o = \pi - \phi_i - \phi_o . \qquad \text{(C-1)}$$



Let the diameter of the cylinder be $d_0$. The distance between a given
point on the surface and the center of the cylinder, projected along
the perpendicular to the viewer's line-of-sight, is

$$x_p = (d_0/2) \sin\theta_o . \qquad \text{(C-2)}$$

The illuminated part of the cylinder seen by the viewer goes from
$\theta_o = \pi/2$, at the edge of the cylinder as seen by the viewer,
where

$$\theta_i(\theta_o = \pi/2) = \pi/2 - \phi_i - \phi_o \qquad \text{(C-3)}$$

and

$$x_p(\theta_o = \pi/2) = d_0/2 \qquad \text{(C-4)}$$

to the edge of the shadow (at $\theta_i = \pi/2$), where

$$\theta_o(\theta_i = \pi/2) = \pi/2 - \phi_i - \phi_o \qquad \text{(C-5)}$$

Figure C-1. Illumination and Viewing Geometry of a Cylinder (Axis
Normal to the Plane of the Page).

and

$$x_p(\theta_i = \pi/2) = (d_0/2) \cos(\phi_i + \phi_o) . \qquad \text{(C-6)}$$

(Note that a negative value of $x_p$ would indicate a point
counterclockwise from the viewing angle.) The illuminated width of the
cylinder from the viewing perspective is

$$d_p = x_p(\theta_o = \pi/2) - x_p(\theta_i = \pi/2) = (d_0/2) [1 - \cos(\phi_i + \phi_o)] . \qquad \text{(C-7)}$$

The angles over which light is scattered toward the viewing angle are

$$\pi/2 - \phi_i - \phi_o \le \theta_o \le \pi/2 \qquad \text{(C-8a)}$$

and

$$\pi/2 \ge \theta_i \ge \pi/2 - \phi_i - \phi_o . \qquad \text{(C-8b)}$$

Consider a reflecting area

$$\Delta A = \Delta y \, (d_0/2) \, \Delta\theta_o \qquad \text{(C-9)}$$

where $\Delta y$ is the length of the area along the axis perpendicular to
the plane of Figure C-1. For a Lambertian surface of reflectivity $r_0$,
the energy density scattered into a solid angle $d\Omega_o$ is

$$\Delta E_o = (r_0/\pi) \, E_i \cos\theta_i \cos\theta_o \, \Delta A \, d\Omega_o$$

$$= (r_0 d_0/2\pi) \, E_i \, \Delta y \cos\theta_i \cos\theta_o \, \Delta\theta_o \, d\Omega_o \qquad \text{(C-10)}$$

[where Eq. (C-8) is valid], where $E_i$ is the incident energy density.

The apparent spatial brightness distribution of the object depends
on the projection of this area onto the plane perpendicular to the
line-of-sight, where the projected area is

$$\Delta A_p = \Delta A \cos\theta_o . \qquad \text{(C-11)}$$

This comes from the fact that from Eq. (C-2)

$$\Delta x_p = (d_0/2) \cos\theta_o \, \Delta\theta_o . \qquad \text{(C-12)}$$

Consequently the projected energy is

$$\Delta E_{op} = (r_0/\pi) \, E_i \cos\theta_i \, \Delta A_p \, d\Omega_o . \qquad \text{(C-13)}$$

From Eqs. (C-1) and (C-2),

$$\cos\theta_i = \cos(\pi - \phi_i - \phi_o - \theta_o)$$

$$= \sin(\phi_i + \phi_o + \theta_o - \pi/2) \qquad \text{(C-14a)}$$

$$= -\cos(\phi_i + \phi_o) \cos\theta_o + \sin(\phi_i + \phi_o) \sin\theta_o \qquad \text{(C-14b)}$$

$$= -\cos(\phi_i + \phi_o) \sqrt{1 - (2x_p/d_0)^2} + \sin(\phi_i + \phi_o) \, (2x_p/d_0) . \qquad \text{(C-14c)}$$

Eqs. (C-13) and (C-14c) give the apparent brightness as a function of
the viewed coordinate, $x_p$. Figure C-2 shows Eq. (C-13) plotted as a
function of $x_p$ for $(\phi_i + \phi_o)$ = 20° to 180° in 20° increments.
[Note that the apparently continuous curves at $x_p = 1$ are pairs of
curves that approach $x_p = 1$ with the same values and slopes, one of the
pair of curves for $(\phi_i + \phi_o)$ and the other for
$(180° - \phi_i - \phi_o)$.]

Now consider the total energy density arriving at a detector. This
can be obtained by integrating Eq. (C-13) over $x_p$ or by integrating
Eq. (C-10), using Eq. (C-14b), over $\theta_o$. The latter is given by

Figure C-2. Relative Brightness (Intensity) Across the Projected Image
of the Cylinder, for $(\phi_i + \phi_o)$ = 20° to 180° in 20°
Increments.

$$\int_0^L dy \int_{\pi/2-\phi_i-\phi_o}^{\pi/2} (r_0 d_0/2\pi) \, E_i \cos\theta_i \cos\theta_o \, d\theta_o \, d\Omega_o$$

$$= L (r_0 d_0/2\pi) \, E_i \, d\Omega_o \int_{\pi/2-\phi_i-\phi_o}^{\pi/2} [-\cos(\phi_i + \phi_o) \cos\theta_o + \sin(\phi_i + \phi_o) \sin\theta_o] \cos\theta_o \, d\theta_o$$

$$= L (r_0 d_0/4\pi) \, E_i \, [\sin(\phi_i + \phi_o) - (\phi_i + \phi_o) \cos(\phi_i + \phi_o)] \, d\Omega_o$$

$$\equiv L (r_0 d_0/2\pi) \, E_i \, V(\phi_i + \phi_o) \, d\Omega_o \qquad \text{(C-15)}$$

where $d\Omega_o$ is the angular subtense of a detector as viewed from the
target. The function

$$V(\phi_i + \phi_o) = (1/2) [\sin(\phi_i + \phi_o) - (\phi_i + \phi_o) \cos(\phi_i + \phi_o)] \qquad \text{(C-16)}$$

is shown in Figure C-3, plotted as a function of $(\phi_i + \phi_o)$ (in
degrees).

Example

Suppose that $\phi_i$ = 10° and $\phi_o$ = 55° so that
$(\phi_i + \phi_o)$ = 65°. Then the illuminated region can be seen for
25° ≤ $\theta_o$ ≤ 90°, for which 90° ≥ $\theta_i$ ≥ 25°. The relative
perceived reflectivity, given by Eqs. (C-13) and (C-14), is proportional
to cos $\theta_i$, which varies from 0 to cos 25° = 0.906, following a
curve slightly above the 60° curve shown in Figure C-2. The perceived
width of the cylinder is $d_p = (d_0/2)(1 - \cos 65°) = 0.577 \, (d_0/2)$,
so for a 0.8 m diameter cylinder, the perceived width would be 0.231 m.
From Eq. (C-16), V(65°) = 0.213 (as compared with the maximum possible
value of $\pi/2$).
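
The numbers in this example can be reproduced with a few lines of code. The sketch below evaluates Eqs. (C-7) and (C-16) and, as a cross-check, integrates $\cos\theta_i \cos\theta_o$ over the range (C-8a) with a midpoint rule. The function names and the step count are illustrative choices, not part of the report.

```python
import math

def illuminated_width(d0, phi_sum_deg):
    """Eq. (C-7): perceived illuminated width for cylinder diameter d0."""
    a = math.radians(phi_sum_deg)
    return (d0 / 2.0) * (1.0 - math.cos(a))

def v_closed_form(phi_sum_deg):
    """Eq. (C-16): V = (1/2) [sin(a) - a cos(a)], with a in radians."""
    a = math.radians(phi_sum_deg)
    return 0.5 * (math.sin(a) - a * math.cos(a))

def v_numeric(phi_sum_deg, steps=20000):
    """Midpoint-rule integral of cos(theta_i) cos(theta_o) over (C-8a)."""
    a = math.radians(phi_sum_deg)
    lo, hi = math.pi / 2.0 - a, math.pi / 2.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        theta_o = lo + (k + 0.5) * h
        cos_theta_i = -math.cos(a + theta_o)     # from Eq. (C-1)
        total += cos_theta_i * math.cos(theta_o) * h
    return total

width = illuminated_width(0.8, 65.0)    # metres, for the 0.8 m cylinder
v65 = v_closed_form(65.0)
v65_check = v_numeric(65.0)
v_max = v_closed_form(180.0)
```

The closed form and the numerical integral should agree closely, which supports the factor of 1/2 relating the bracketed term in (C-15) to V in (C-16).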
Figure C-3. Relative Energy Density Arriving at an Aperture-Plane
Detector as a Function of $(\phi_i + \phi_o)$ (in Degrees).

IMAGE RECONSTRUCTION FOR AN ABERRATED AMPLITUDE
INTERFEROMETER WITH A PARTIALLY-FILLED APERTURE


J.  R.  Fienup  and  J.  D.  Gorman 

Optical  Science  Laboratory 
Environmental  Research  Institute  of  Michigan 
P.O. Box 8618, Ann Arbor, Michigan 48107-8618, USA


1.  Introduction 

Measurements obtained with an aperture-plane amplitude interferometer
[1,2] utilizing a 180° rotational shear through a telescope having a
partially-filled aperture can have missing spatial frequency bands
corresponding to the aperture-plane regions where there is no aperture
fill. The system transfer function in this case is a scaled version of
the telescope aperture function, and the missing Fourier-domain data
causes the resulting images to be highly distorted [Figures 1(e) and
1(f) for example]. This is in contrast to conventional focal-plane
imaging systems where the system transfer function is the
autocorrelation of the telescope aperture function, in which case Wiener
filtering can often be used to level the transfer function. A further
complication arising in the image formation process is that for
realistic imaging systems, the phase of the Fourier data can be
corrupted or completely lost in the presence of atmospheric turbulence
or optical aberrations. Thus there are two difficulties which complicate
the reconstruction of images from aperture-plane amplitude
interferometer measurements: the absence of particular spatial frequency
bands and the possible corruption of the phase of the data. This paper
examines an application of the iterative Fourier transform algorithm
[3,4,5] to the problem of reconstructing missing Fourier-domain
information from aberrated aperture-plane amplitude interferometer
measurements to obtain diffraction-limited imagery corresponding to a
filled aperture.

Common examples of collection systems having partially-filled apertures
are telescopes with a central obscuration, for which the low and middle
spatial-frequency bands are blocked by the secondary mirror; and
segmented or multiple-mirror telescopes, for which certain middle and
high spatial-frequency bands are lost. Two types of aperture functions
were considered in this study: an annular aperture, which will be
denoted as aperture A, and a segmented aperture consisting of a
hexagonal arrangement of seven smaller circular apertures, which shall
be denoted as aperture H. Figure 1(d) shows the original object used in
the simulations. Its Fourier transform, the magnitude of which is shown
in Figure 1(a), was multiplied by aperture A to obtain the aperture
plane data of Figure 1(b) and corresponding image, Figure 1(e). The
transform was also multiplied by aperture H to obtain the aperture plane
data of Figure 1(c) and corresponding image, Figure 1(f). The dynamic
range of the Fourier magnitude data in Figures 1(a-c), 3(a-c) and 4(a-c)
is quite large, so the square root of the Fourier magnitude is
displayed.

Three  scenarios  were  investigated:  (i)  the  Fourier  magnitude  is  measured  over  a  filled 
aperture,  (ii)  the  Fourier  magnitude  and  phase  are  measured  over  a  partial  aperture,  and 
(iii)  the  Fourier  magnitude  is  measured  over  a  partial  aperture  and  no  phase  information  is 
measured.  The  iterative  Fourier  transform  is  used  to  reconstruct  the  missing  data  for  all  these 
cases. 

Case  (i)  corresponds  to  the  situation  in  which  the  Fourier  magnitude  is  known  over  an 
entire  filled  aperture.  Here,  the  image  reconstruction  problem  is  equivalent  to  reconstructing 
the  Fourier  phase  over  the  aperture  and  the  problem  is  that  of  phase  retrieval.  The  iterative 
transform algorithm is robust in this case. Examples of such reconstructions will not be given
here since they can be found in References [3,4,5,6], including the case where large amounts of
noise are present [6]. Case (ii) corresponds to the situation where there are no aberrations, but
the  complex  Fourier  data  is  incomplete  due  to  missing  frequency  bands.  The  reconstruction 
of  an  image  from  such  data  requires  that  the  Fourier  magnitude  and  phase  be  reconstructed 
within  the  missing  frequency  bands  to  obtain  an  estimate  of  a  filled  aperture  plane.  Hence  the 
problem  is  equivalent  to  that  of  interpolation.  The  iterative  algorithm  is  used  to  interpolate 
the  missing  spatial  frequency  bands.  One  could  also  consider  extrapolating  the  Fourier  domain 
data  out  to  higher  spatial  frequencies;  however  this  problem  is  known  to  be  very  ill-posed  and 
it  is  not  considered  here.  Case  (iii)  corresponds  to  the  situation  in  which  there  is  no  phase 
information  at  all  and  the  Fourier  magnitude  is  known  only  over  a  partial  aperture.  Here 
the  image  reconstruction  problem  requires  both  phase  retrieval  and  interpolation.  This  case 
is  perhaps  the  most  realistic  setting,  in  which  aberrated  measurements  are  taken  through  a 
telescope  with  a  partially-filled  aperture.  Unfortunately,  out  of  the  three  cases  investigated  it 
also  poses  the  most  difficult  reconstruction  problem. 

In the following discussion, reconstruction examples for cases (ii) and (iii) will be described.
In  each  of  these  cases,  the  iterative  Fourier  transform  algorithm  was  applied,  each  iteration 
consisting  of  the  following  four  steps,  as  illustrated  in  Figure  2: 

1. The current image estimate is Fourier transformed to produce an estimate of the object's
Fourier transform over the entire Fourier domain.

2. Fourier-domain constraints corresponding to the measured data are satisfied by: (a)
replacing the magnitude and phase of the current estimate with the measured magnitude
and phase within the region of the aperture plane corresponding to the telescope aperture
function [note that in cases (i) and (iii) the phase is not measured and only the magnitude
is replaced], (b) leaving the Fourier transform unaltered over the missing frequency bands
within the filled aperture, and (c) setting the Fourier transform to zero outside the filled
aperture.

3.  The  result  is  inverse  Fourier  transformed. 

4.  The  object-domain  constraints  of  positivity  and  object  support  are  satisfied  using  one  of 
two methods: Error Reduction (ER), which is a Gerchberg-type algorithm [7], or Hybrid
Input/Output (HIO) [3,4,5].
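
The four steps above can be sketched in code as follows. This is a minimal illustration and not the authors' implementation: the function name `iterative_transform`, the boolean masks `measured` (telescope aperture) and `band` (filled aperture), the feedback value `beta`, and the convention that masks follow the unshifted FFT layout are all assumptions made for the example. The HIO update shown is the usual form, in which pixels violating the object-domain constraints are set to the previous input minus `beta` times the current output.

```python
import numpy as np

def iterative_transform(data, measured, band, support, g0,
                        n_iter=50, method="ER", beta=0.7, phase_known=True):
    """Sketch of one ER/HIO loop; `measured` is assumed to lie inside `band`."""
    g = g0.astype(float).copy()
    for _ in range(n_iter):
        G = np.fft.fft2(g)                                   # step 1
        if phase_known:                                      # step 2(a), case (ii)
            G_new = np.where(measured, data, G)
        else:                                                # step 2(a), cases (i), (iii)
            G_new = np.where(measured,
                             np.abs(data) * np.exp(1j * np.angle(G)), G)
        # step 2(b) is implicit: unmeasured in-band values keep the estimate G
        G_new = np.where(band, G_new, 0.0)                   # step 2(c)
        g_out = np.fft.ifft2(G_new).real                     # step 3
        ok = support & (g_out >= 0)                          # step 4: constraints met
        if method == "ER":
            g = np.where(ok, g_out, 0.0)
        else:                                                # HIO
            g = np.where(ok, g_out, g - beta * g_out)
    return g

# Tiny demonstration with complete, noise-free Fourier data (illustrative only)
rng = np.random.default_rng(0)
obj = np.zeros((16, 16))
obj[5:9, 6:10] = rng.random((4, 4)) + 0.1
data = np.fft.fft2(obj)
full = np.ones(obj.shape, dtype=bool)
recon = iterative_transform(data, full, full, obj > 0,
                            np.zeros_like(obj), n_iter=1)
```

With complete magnitude and phase data the loop reproduces the object in a single pass, which is only a consistency check; the interesting cases are those where `measured` covers part of `band` or `phase_known` is false.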


The object-domain support constraint is determined from the measured
data in one of two ways. If the phase is known over part of the Fourier
domain, then one can form a degraded image from the partial Fourier
magnitude and phase data. An initial support constraint can then be
formed by thresholding the magnitude of the degraded image. To minimize
the ringing effects due to the partial fill of the aperture, it is
necessary to first apply a weighting function to the Fourier magnitude
data. If there is no measured phase, then an object-domain support is
determined from the Fourier magnitude as follows. The magnitude is
squared and inverse Fourier transformed to obtain the autocorrelation of
the object. Again, weighting of the Fourier-domain squared magnitude may
be necessary to avoid excessive ringing in the autocorrelation. The
autocorrelation is then thresholded to obtain an estimate of the
autocorrelation support. An initial estimate of the object support is
then obtained from the autocorrelation support by using a
triple-intersection rule [8,9]. For future reference, the object support
estimate determined according to this rule will be called the
triple-intersection support. It is important to note that in both cases
the object support estimates described above rely on thresholded values
and thus may exclude parts of the actual object. Hence as the iterations
progress, the support constraint is enlarged by including neighboring
pixels, thus ensuring that the whole object is eventually contained
within the support constraint.
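
The first part of the no-phase procedure, estimating the autocorrelation support from the Fourier magnitude, can be sketched as below. The function name `autocorrelation_support`, the optional `weight` array, and the relative `threshold` value are assumptions for illustration. The triple-intersection rule of [8,9] is not implemented here, so this sketch stops at the autocorrelation support.

```python
import numpy as np

def autocorrelation_support(magnitude, weight=None, threshold=0.05):
    """Threshold the autocorrelation computed from the Fourier magnitude."""
    power = magnitude ** 2                       # squared Fourier magnitude
    if weight is not None:
        power = power * weight                   # optional weighting against ringing
    autocorr = np.fft.ifft2(power).real          # autocorrelation of the object
    autocorr = np.fft.fftshift(autocorr)         # move zero lag to the array center
    return np.abs(autocorr) >= threshold * np.abs(autocorr).max()

# Illustrative object: a 4 x 4 square of ones in a 32 x 32 field
obj = np.zeros((32, 32))
obj[10:14, 12:16] = 1.0
magnitude = np.abs(np.fft.fft2(obj))
ac_support = autocorrelation_support(magnitude, threshold=0.01)
```

For this square object the autocorrelation is nonzero over a 7 x 7 region of lags, so with a low threshold `ac_support` should mark 49 pixels centered on the zero-lag position.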

2.  Case  (ii),  Partial  Fourier  Magnitude  and  Phase 

Figures 3 and 4 show examples of the iterative transform algorithm
applied to the problem of interpolation. The measured data was assumed
to consist of the Fourier magnitude and phase over a partial aperture.
Figure 3(b) shows the simulation of measurements over aperture A, for
which the Fourier data over a central disk 1/3 the diameter of the
filled aperture was blocked. Hence the ratio of the area of the blocked
region to the entire filled aperture was 1/9. Figure 4(b) shows the data
corresponding to aperture H, for which Fourier data was only collected
over seven small circular subapertures. The ratio of the area where
there was no Fourier information to the area of a filled aperture
encompassing aperture H was 1/2. The images corresponding to the data
collected in apertures A and H are shown in Figures 3(e) and 4(e)
respectively. These images were used as the initial object estimates for
the iterative transform algorithm.

For the case of aperture A, Figure 3(g) shows the initial support
constraint, which was a thresholded version of the degraded image shown
in Figure 3(e). Enlarged support constraints which were used as the
iterations progressed are shown in Figures 3(h) and 3(i). With the
support constraint of Figure 3(i) in place, the algorithm converged
quite quickly to a solution consistent with the support constraint and
the measured Fourier data, yet it did not converge to the true solution.
The resulting reconstructed image, shown in Figure 3(f), still has some
distortion; nevertheless it appears to be better than the initial
estimate, shown in Figure 3(e). Similar reconstruction results were
obtained for the case of aperture H and are shown in Figure 4.

An  examination  of  the  Fourier  magnitude  of  the  reconstructed  image,  shown  in  Figure 
3(c),  indicates  that  part  of  the  problem  in  the  reconstruction  may  be  that  the  magnitude  in 
the  interpolated  region  of  the  Fourier  plane  is  underestimated.  Figure  5  shows  a  plot  of  cuts 
through the filled-aperture Fourier magnitude and the interpolated Fourier magnitude. Over
the  blocked  central  region,  the  peaks  of  the  estimated  Fourier  magnitude  appear  to  be  in  the 
right  place  but  they  are  smaller  and  show  less  contrast  than  the  true  Fourier  magnitude. 

Thus, for the case of interpolation only, the algorithm converged
quickly, but the reconstructed image was of mediocre quality. The fast
convergence is due to the fact that the constraints in each domain form
a convex set. The ER algorithm for this case is a projection onto convex
sets (POCS) algorithm. POCS algorithms are known to have strong
convergence properties [10]. However, the poor quality of the
reconstructed images, despite the absence of noise in the measurements,
can be an indication that the interpolation problem is ill-posed.

3.  Case  (iii),  Partial  Fourier  Magnitude  and  No  Phase 

Figure 6 shows an example of the iterative transform algorithm applied
to the problem of simultaneous phase retrieval and interpolation. In
this case, an aberrated aperture-plane measurement was simulated for a
centrally-blocked aperture in which the central obscuration was a circle
with 1/8th the diameter of the filled aperture. The phase was assumed to
be too corrupted to be useful, so that the only input data to the
algorithm was the Fourier magnitude over a partial aperture having an
annular shape.




Figure 6(a) shows the original object used in the simulation. For
reference, Figure 6(b) shows the image corresponding to error-free
magnitude and phase measurements over the centrally-obscured aperture.
This image was assumed to be unavailable since the Fourier phase is
unknown. The initial triple-intersection object support constraint
computed from the given Fourier magnitude is shown in Figure 6(d).
Enlarged versions of the support constraint are shown in Figures 6(e)
and 6(f). The initial estimate for the object was obtained by filling
the support shown in Figure 6(d) with uniformly distributed random
numbers. A partially-reconstructed image was obtained from the partial
Fourier magnitude data using the support constraints shown in Figures
6(d-f). The algorithm was then rerun using a different sequence of
random numbers, yielding a second partially-reconstructed image. Two
more partially-reconstructed images were obtained similarly, using a
second initial support constraint. This second support constraint was
generated by applying a triple-intersection rule to an autocorrelation
support computed with a different threshold value. The four
partially-reconstructed images then were combined to form a composite
image by using the stripe-removal methods described in Reference [5].
The resulting reconstructed image, shown in Figure 6(c), still has some
stripe artifacts but is otherwise a faithful representation of the true
object. The experiment was repeated with much larger central
obscurations but the quality of the resulting reconstructed images was
significantly degraded.

4.  Conclusions 

In practical optical systems, the measurements made in aperture-plane
amplitude interferometry can have missing spatial frequency bands.
Moreover, the phase of these measurements can be corrupted by
atmospheric turbulence or aberrations present in the optical system. The
reconstruction of an extended object from these measurements thus
involves the interpolation of the missing frequency bands and the
retrieval of the missing or aberrated phase. In this paper we
demonstrated that the iterative transform algorithm can be used for
phase retrieval or interpolation or both simultaneously.

It was found that, for the phase-retrieval problem of reconstructing an
image from filled-aperture magnitude and no phase, the algorithm
converges reasonably quickly to the correct solution. For the
interpolation problem it was found that the algorithm converged quickly
to a solution, but that the solution is not necessarily close to the
original object, indicating that the problem of interpolation is not a
well-posed problem. The most realistic problem is the case where the
magnitude is measured over a partial aperture and the phase is not
available at all. In this case, the problem is that of simultaneous
phase retrieval and interpolation. For the case where the missing
Fourier magnitude covered a region about the origin with 1/64th the area
of the filled aperture, a good reconstruction was obtained using the
iterative transform algorithm augmented by the stripe-removal methods of
[5]. Thus it is possible to combine phase retrieval and interpolation in
the reconstruction of an image from partial Fourier magnitude
information if the interpolation is confined to a small region of the
aperture plane.

Figure 1. Aperture-plane measurements and corresponding images: (a) filled-aperture Fourier
magnitude, (b) Fourier magnitude over aperture A, (c) Fourier magnitude over aperture H,
(d) filled-aperture image, (e) aperture A image, (f) aperture H image.

Figure 3. Interpolation from Fourier magnitude and phase over aperture A: (a) filled-aperture
Fourier magnitude, (b) Fourier magnitude over aperture A, (c) Fourier magnitude of
reconstructed image, (d) filled-aperture image, (e) aperture A image, (f) reconstructed image,
(g) support formed from thresholding aperture A image, (h) enlarged support constraint, (i)
further-enlarged support constraint.


Figure 4. Interpolation from Fourier magnitude and phase over aperture H. (See caption to
Figure 3.)


Figure 5. Cuts through the origin of the filled-aperture Fourier magnitude (dotted line) and the
Fourier magnitude of the reconstructed image of Figure 3 (solid line).


Figure 6. Interpolation and phase retrieval from partial-aperture Fourier magnitude: (a)
filled-aperture image, (b) partial-aperture image with correct Fourier phase, (c) reconstructed
image, (d) triple-intersection support constraint, (e) enlarged support constraint, (f)
further-enlarged support constraint.

The iterative blind deconvolution algorithm proposed by Ayers and Dainty [Opt. Lett. 13, 547 (1988)] and improved
on by Davey et al. [Opt. Commun. 69, 353 (1989)] is applied to the problem of phase retrieval, which is a special case
of the blind deconvolution problem. A close relationship between this algorithm and the error-reduction version of
the iterative Fourier-transform phase-retrieval algorithm is shown analytically. The performance of the blind
deconvolution algorithm is compared with the error-reduction and hybrid input-output versions of the iterative
Fourier-transform algorithm by reconstruction experiments on real-valued, nonnegative images with and without
noise.


1.  INTRODUCTION 

Blind deconvolution is the problem of finding two unknown
functions, $f(x)$ and $g(x)$, from a noisy measurement, $c(x)$, of
the convolution of these functions, defined as

$$c(x) = \int f(x')\, g(x - x')\, dx' + n(x) = f(x) * g(x) + n(x), \tag{1}$$

or in the Fourier domain as

$$C(u) = F(u)\,G(u) + N(u), \tag{2}$$

where $C$, $F$, $G$, and $N$ are the Fourier transforms of $c$, $f$, $g$, and
$n$, respectively. Ayers and Dainty* recently proposed a
practical, two-dimensional blind deconvolution algorithm
for the noise-free case, where the additive noise term $n(x) = 0$.

In this paper we apply the Ayers-Dainty (AD) algorithm
to the phase-retrieval problem, in which we desire to recover
an image, $f(x)$, from the modulus, $|F(u)|$, of its Fourier
transform:

$$F(u) = |F(u)| \exp[i\psi(u)] = \mathcal{F}[f(x)] = \int f(x) \exp[-i 2\pi (u \cdot x)]\, dx. \tag{3}$$

Phase retrieval is equivalent to the reconstruction of the
Fourier phase, $\psi(u)$, from the Fourier modulus and to the
reconstruction of $f(x)$ or $\psi(u)$ from the autocorrelation
function:

$$r(x) = \int f(x')\, f^*(x' - x)\, dx' = \mathcal{F}^{-1}[F(u) F^*(u)] = \mathcal{F}^{-1}[|F(u)|^2]. \tag{4}$$

The  phase-retrieval  problem  arises  in  several  disciplines  in¬ 
cluding  optical  and  radio  astronomy,  wave-front  sensing, 
holography,  and  remote  sensing. 
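
As a numerical illustration of Eq. (4), the following minimal sketch checks that the inverse transform of the squared Fourier modulus equals the directly computed autocorrelation. It assumes a discrete, cyclic setting with a zero-padded array; the test object, array sizes, and variable names are hypothetical and are not taken from the experiments reported here.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
# Hypothetical test object: real, nonnegative, confined to a 4 x 4 corner
# of an 8 x 8 array so that the cyclic autocorrelation does not wrap around.
f = np.zeros((N, N))
f[:4, :4] = rng.random((4, 4))

# Autocorrelation via Eq. (4): inverse transform of the squared modulus.
F = np.fft.fft2(f)
r_fourier = np.fft.ifft2(np.abs(F) ** 2).real

# Direct evaluation of r(x) = sum over x' of f(x') f*(x' - x) (f is real here).
r_direct = np.zeros((N, N))
for dx in range(N):
    for dy in range(N):
        shifted = np.roll(f, shift=(dx, dy), axis=(0, 1))  # shifted[x'] = f[x' - x]
        r_direct[dx, dy] = np.sum(f * shifted)

print(np.allclose(r_fourier, r_direct))
```

The agreement of the two arrays is what allows the measured intensity $|F(u)|^2$ to play the role of the Fourier transform of a convolution measurement.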


Comparing Eqs. (1) [with $n(x) = 0$] and (4), we find that
phase retrieval can be considered a special case of blind
deconvolution, in which we deconvolve $f(x)$ and $f^*(-x)$ from
$r(x)$. Because the AD algorithm represents a new, practical
algorithm for blind deconvolution, we will apply it to phase
retrieval and compare it with two existing phase-retrieval
algorithms. We will begin by describing the AD algorithm
and adaptations of the algorithm appropriate for phase
retrieval. Because its structure closely resembles that of the
error-reduction (ER) algorithm commonly used for phase
retrieval,‘~* the AD algorithm is compared both analytically
and experimentally with ER. The performance of both of
these algorithms is compared with the faster hybrid
input-output (HIO) algorithm*"^ for real, nonnegative objects for
the cases of known and unknown support, using Fourier
intensity data with different levels of additive Gaussian
noise.

2.  DESCRIPTION  OF  THE  ALGORITHM 

A. Blind Deconvolution

The AD blind deconvolution algorithm* (Fig. 1) alternates
between the object domain and the Fourier domain, enforcing
known constraints in each domain. Object-domain constraints
such as support and nonnegativity are combined
with the Fourier-domain constraint of Eq. (2) to produce
new estimates of $f$ and $g$, $\hat f_k$ and $\hat g_k$, respectively, at each
iteration. Note that each AD loop produces two estimates
of $F$ (and $G$): (1) $\hat F_k$, the Fourier transform of $\hat f_k$, and (2) the
estimate obtained by imposing the Fourier-domain constraint
of Eq. (2). These two estimates are averaged by
using the scalar $\beta$ ($0 \le \beta \le 1$) to form $F_k$, a composite
estimate of $F$. Ayers and Dainty proposed the following
estimate of $F$ from $\hat F_k$ and $\hat G_k$, the Fourier transform of $\hat g_k$:

if $|C(u)| <$ noise level,

$$F_k(u) = \hat F_k(u); \tag{5a}$$

if $|\hat G_k(u)| \ge |C(u)|$,

$$F_k(u) = (1 - \beta)\,\hat F_k(u) + \beta\,\frac{C(u)}{\hat G_k(u)}; \tag{5b}$$

if $|\hat G_k(u)| < |C(u)|$,

$$\frac{1}{F_k(u)} = \frac{1 - \beta}{\hat F_k(u)} + \beta\,\frac{\hat G_k(u)}{C(u)}. \tag{5c}$$

Fig. 1. AD blind deconvolution algorithm.


Rather than implementing Eqs. (5), we use a Wiener-type
filter based on the following imaging model:

$$c(x) = s(x) * f(x) + n(x), \tag{6}$$

or in the Fourier domain

$$C(u) = S(u)\,F(u) + N(u), \tag{7}$$


where $c$ is the measured image, $f$ is the object, $s$ is the impulse
response [the Fourier transform of which is $S(u)$, the optical
transfer function], and $n$ is the noise. Assuming that $f$ and $n$
are independent, zero-mean, Gaussian random processes,
the minimum mean-squared-error linear estimator for $f(x)$
is'’ $\hat f(x) = \mathcal{F}^{-1}[\hat F(u)]$, where

$$\hat F(u) = W(u)\,C(u), \tag{8}$$

the Wiener-Helstrom filter is

$$W(u) = \frac{S^*(u)}{|S(u)|^2 + \langle |N(u)|^2 \rangle / \langle |F(u)|^2 \rangle}, \tag{9}$$

and $\langle |N(u)|^2 \rangle$ and $\langle |F(u)|^2 \rangle$ are the ensemble-averaged energy
spectra of the noise and the object, respectively. Although
the images generally will not satisfy the statistical
assumptions stated above, the filter is still effective and
simple to implement. The Wiener-Helstrom filter of Eq.
(9) is often used for image restoration.
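
The filter of Eqs. (8) and (9) can be sketched in a few lines. This is a minimal illustration, assuming discrete arrays and a scalar noise-to-object power ratio; the function name `wiener_helstrom`, the two-point impulse response, and the power values are hypothetical choices for the example, not values used in this work.

```python
import numpy as np

def wiener_helstrom(C, S, noise_power, object_power):
    """Apply Eqs. (8)-(9): F_hat = W C with W = S* / (|S|^2 + <|N|^2>/<|F|^2>)."""
    W = np.conj(S) / (np.abs(S) ** 2 + noise_power / object_power)
    return W * C

rng = np.random.default_rng(4)
f = rng.random((8, 8))                 # hypothetical object
s = np.zeros((8, 8))
s[0, 0], s[0, 1] = 1.0, 0.5            # hypothetical impulse response, |S| >= 0.5 everywhere
F, S = np.fft.fft2(f), np.fft.fft2(s)
C = S * F                              # noise-free measurement, Eq. (7) with N = 0

# With a negligible noise term the filter approaches the inverse filter 1/S.
F_est = wiener_helstrom(C, S, noise_power=1e-12, object_power=1.0)
print(np.allclose(F_est, F))
```

Increasing `noise_power` relative to `object_power` attenuates the frequencies where $|S(u)|$ is small, which is the regularizing behavior that the AD filters below inherit.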

To apply Eq. (9) to the problem of estimating $F$ from $C$
and $G$, we relate Eq. (2) to Eq. (7) [and, hence, Eq. (1) to Eq.
(6)] by allowing $G(u)$ to play the role of $S(u)$. The resulting
Fourier-domain constraint (with $\beta = 1$) is

$$F_k(u) = \frac{\hat G_k^*(u)}{|\hat G_k(u)|^2 + \sigma^2 / |\hat F_k(u)|^2}\, C(u), \tag{10}$$

where $\hat G_k$ is the latest estimate of $G$, the constant $\sigma^2$ is an
estimate of $\langle |N|^2 \rangle$, and $|\hat F_k|^2$ is used to estimate $\langle |F|^2 \rangle$. A
filter similar to this was used with the AD algorithm by
Davey et al.^ for the blind deconvolution of noisy,
complex-valued images. We have approximated $\langle |N|^2 \rangle$ with a
constant based on the assumption that $n(x)$ is a delta-correlated,
Gaussian random process. If the ensemble-averaged
energy spectrum of the noise is known, it should replace $\sigma^2$ in
Eq. (10).

To estimate $G$ from $C$ and $\hat F_k$, the latest estimate of $F$, in
Eq. (10) we replace $F_k$ with $G_k$, $\hat G_k$ with $\hat F_k$, and, following the
indexing of Fig. 1, $\hat F_k$ with $\hat G_{k-1}$:

$$G_k(u) = \frac{\hat F_k^*(u)}{|\hat F_k(u)|^2 + \sigma^2 / |\hat G_{k-1}(u)|^2}\, C(u). \tag{11}$$


We have also used an even simpler Wiener-type filter,
formed by replacing the term $\sigma^2 / |\hat F_k|^2$ in the denominator of
Eq. (10) with a constant, $\alpha$:

$$F_k(u) = \frac{\hat G_k^*(u)}{|\hat G_k(u)|^2 + \alpha}\, C(u). \tag{12}$$


We will refer to this simpler filter as AD Filter 1, and the
filter in Eq. (10) as AD Filter 2. We make the same
substitutions that are made for Eq. (10) to obtain the following
expression for $G_k(u)$ from Eq. (12):

$$G_k(u) = \frac{\hat F_k^*(u)}{|\hat F_k(u)|^2 + \alpha}\, C(u). \tag{13}$$


B. Phase Retrieval

As we noted in Section 1, phase retrieval can be viewed as the
process of blindly deconvolving a function $f(x)$ and its twin,
$f^*(-x)$. Thus for phase retrieval the noisy measurements of
$r(x)$ and $|F(u)|^2$ take on the roles of $c(x)$ and $C(u)$, respectively,
and $F_k(u)$ and $G_k(u)$ become estimates of $F(u)$ and $F^*(u)$,
respectively. Because the two convolution factors are twins,
the AD algorithm actually produces two estimates of $f$ per
iteration. Therefore we need only consider half of the AD
loop (Fig. 2); i.e., instead of estimating $F^*(u)$ and $f^*(-x)$ we
forego the second half of the loop and find a new estimate of
$F(u)$ by conjugating $G_k(u)$, the estimate of $F^*(u)$. Replacing
$C$ with $|F|^2$, we conjugate Eq. (13) to obtain the AD Filter 1
phase-retrieval Fourier-domain constraint:

$$F_k(u) = G_k^*(u) = \frac{\hat F_k(u)}{|\hat F_k(u)|^2 + \alpha}\, |F(u)|^2. \tag{14}$$


AD Filter 2 is modified in a similar manner by conjugating
Eq. (11) and substituting $|\hat F_k|^2$ for $|\hat G_{k-1}|^2$:

$$F_k(u) = \frac{\hat F_k(u)}{|\hat F_k(u)|^2 + \sigma^2 / |\hat F_k(u)|^2}\, |F(u)|^2. \tag{15}$$


Note that for photon (shot) noise in the measurement of
$C(u)$, which would have a variance proportional to the mean
of $|F|^2$, the quantity $\sigma^2 / |\hat F_k(u)|^2$ is equivalent to $\alpha$ in Eq. (14).
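
The two phase-retrieval Fourier-domain constraints can be sketched as follows. This is an illustrative implementation under stated assumptions: the input `F_hat` denotes the Fourier transform of the current object-domain estimate, `intensity` denotes the measured $|F(u)|^2$, and the function names, the small floor `eps` guarding against division by zero, and the test arrays are hypothetical additions that do not appear in the text.

```python
import numpy as np

def ad_filter1(F_hat, intensity, alpha):
    """AD Filter 1, Eq. (14): constant regularization alpha in the denominator."""
    return F_hat * intensity / (np.abs(F_hat) ** 2 + alpha)

def ad_filter2(F_hat, intensity, sigma2, eps=1e-30):
    """AD Filter 2, Eq. (15): regularization sigma^2 / |F_hat|^2.

    eps is a hypothetical floor that avoids dividing by an exactly zero modulus.
    """
    mag2 = np.maximum(np.abs(F_hat) ** 2, eps)
    return F_hat * intensity / (mag2 + sigma2 / mag2)

rng = np.random.default_rng(3)
f_true = np.zeros((8, 8))
f_true[:4, :4] = rng.random((4, 4))          # hypothetical object
intensity = np.abs(np.fft.fft2(f_true)) ** 2  # noise-free Fourier intensity

f_est = np.zeros((8, 8))
f_est[:4, :4] = rng.random((4, 4))           # hypothetical current estimate
F_hat = np.fft.fft2(f_est)

F1 = ad_filter1(F_hat, intensity, alpha=1e-6)
F2 = ad_filter2(F_hat, intensity, sigma2=1e-6)
```

Both functions multiply `F_hat` by a real, nonnegative factor, so the phase of the estimate is left unchanged and only its modulus is adjusted.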

Fig. 2. AD blind deconvolution algorithm applied to phase retrieval.


know the original object, $f(x)$. Recalling that the estimate of
$f(x)$ after the $k$th iteration is $\hat f_k(x)$, we define the NRMS
error,


^  -  x„)  -  f(x)\- 

ABSERRs  - 


where $x_0$ maximizes the cross correlation between $f$ and $\hat f_k$,
and


C. Comparison with Error Reduction

The flow chart in Fig. 2 of the AD algorithm applied to phase
retrieval is identical in form to the ER algorithm. The
difference between the ER algorithm and the AD algorithm
lies with the Fourier-domain constraint. In the ER algorithm
the Fourier-domain constraint is imposed by substituting
the known modulus, $|F(u)|$, for $|\hat F_k(u)|$, the modulus of
the Fourier transform of $\hat f_k(x)$, the estimate of the object. If
we write $\hat F_k(u) = |\hat F_k(u)| \exp[i\phi_k(u)]$, then the Fourier-domain
step in the ER algorithm gives

$$F_k(u) = |F(u)| \exp[i\phi_k(u)]. \tag{16}$$

If for simplicity we assume that we are using an inverse filter
[which corresponds to the noise-free case and is obtained by
setting $\alpha = 0$ in Eq. (14) or $\sigma = 0$ in Eq. (15)], then the AD
Fourier-domain constraint can be written as

$$F_k(u) = \frac{|F(u)|^2}{|\hat F_k(u)|} \exp[i\phi_k(u)]. \tag{17}$$

Comparison of Eqs. (16) and (17) shows that, for the noise-free
case, the Fourier-domain constraint of the AD algorithm
is similar to that of the ER algorithm; they both
produce estimates with the same phase, and the magnitudes
of both estimates are boosted (or attenuated) where $|F| / |\hat F_k|
> 1$ (or $< 1$). Because the object-domain operations are
identical and the Fourier-domain constraints are so similar,
we expect the AD and ER algorithms to behave similarly.
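
The relation between the two constraints can be made explicit with a short worked step. The notation below is an assumption made for this illustration: $\hat F_k = |\hat F_k| \exp(i\phi_k)$ is the transform of the current object estimate, and the superscripts ER and AD are labels introduced here only to distinguish the two outputs.

```latex
\begin{align*}
F_k^{\mathrm{ER}}(u) &= |F(u)|\, e^{i\phi_k(u)}
  = \hat F_k(u)\, \frac{|F(u)|}{|\hat F_k(u)|}, \\
F_k^{\mathrm{AD}}(u) &= \frac{\hat F_k(u)}{|\hat F_k(u)|^2}\, |F(u)|^2
  = \hat F_k(u) \left( \frac{|F(u)|}{|\hat F_k(u)|} \right)^{2}
  \qquad (\alpha = 0).
\end{align*}
```

Under these assumptions both updates rescale $\hat F_k(u)$ by a real, positive factor; ER applies the ratio $|F|/|\hat F_k|$ once, whereas the inverse-filter form of AD applies its square, so AD overcorrects the modulus relative to ER wherever the ratio differs from unity.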


Fig. 3. Comparison of phase retrieval using AD blind deconvolution
with the HIO and ER iterative transform algorithms for a
real-valued, nonnegative object with known support and no Fourier
modulus error. Reconstructed images: (A) HIO/ER
(indistinguishable from the original object), (B) ER, (C) AD with the Fourier
constraint of Eq. (14), (D) AD with the Fourier constraint of Eq.
(15).


3.  EXPERIMENTAL  SIMULATIONS 

The two versions of the AD algorithm (AD Filters 1 and 2)
were compared experimentally with each other, with ER,
and with a combination of HIO and ER (HIO/ER) for two
cases: (1) a real-valued, nonnegative object with a priori
known triangular support of side 128 pixels embedded in a
256 X 256 array and (2) a real-valued, nonnegative object
with unknown support (approximately 40 X 60 pixels) in a
128 X 128 array. The triangular support in case (1) was
chosen to allow for rapid convergence even for the slower
algorithms.' For case (1) we also added Gaussian noise to
the Fourier intensity data. The reconstructions for case (2)
are more difficult because the support is unknown and
because it is of a less-favorable shape.' For each case, the same
initial guess is used to begin all the algorithms.

A  useful  error  metric  for  measuring  the  success  of  the 
reconstruction  is  the  normalized  root-mean-squared 
(NRMS)  error  with  the  original  object.  This  error  metric 
takes  advantage  of  the  fact  that,  in  a  simulation  like  this,  we 

Fig. 4. ABSERR versus iteration number for the reconstructions
of Fig. 3.

Fig. 5. Comparison of the effect of the pre-Wiener filtering of noisy
Fourier intensity data on reconstructions with the ER algorithm.
Reconstructed images after 1000 iterations: (A) 5% FME, no
pre-Wiener filtering; (B) 5% FME, pre-Wiener filtering; (C) 20% FME,
no pre-Wiener filtering; (D) 20% FME, pre-Wiener filtering.


Fig. 6. Comparison of phase retrieval using AD, HIO, and ER for a
real-valued, nonnegative object with known support and 5% FME.
Reconstructed images: (A) HIO/ER, (B) ER, (C) AD with the
Fourier constraint of Eq. (14), (D) AD with the Fourier constraint of
Eq. (15).


The reconstructions for case (1) with noise-free Fourier
intensity data are shown in Fig. 3 [AD Filter 1 corresponds to
Eq. (14), and AD Filter 2 to Eq. (15)]. The ER and AD
images exhibit similar striping artifacts, which are frequently
seen in iterative reconstruction.^ Methods developed for
eliminating the stripes^ were not attempted here. The HIO/ER
image avoids this stagnation effect and converges more
quickly to a solution indistinguishable from the original
object. Figure 4 is a plot of ABSERR versus iteration number
for the reconstructions of Fig. 3. The AD and ER algorithms
stagnated after approximately 50 iterations, while
HIO/ER converged to the solution in fewer than 100
iterations. Because we used filter parameters $\alpha$ and $\sigma^2$ that were


Fig. 7. Comparison of phase retrieval using AD, HIO, and ER for a
real-valued, nonnegative object with known support and 20% FME.
Reconstructed images: (A) HIO/ER, (B) ER, (C) AD with the
Fourier constraint of Eq. (14), (D) AD with the Fourier constraint of
Eq. (15).

is a scalar that can be shown to minimize ABSERR.

Fig. 8. ABSERR versus iteration number for the reconstructions
of Fig. 7.

Fig. 9. Comparison of phase retrieval using AD, HIO, and ER for a
real-valued, nonnegative object with unknown support and no
FME. (A) Object. Reconstructed images: (B) HIO/ER, (C) ER,
(D) AD with the Fourier constraint of Eq. (14), (E) AD with the
Fourier constraint of Eq. (15).

Fig. 10. ABSERR versus iteration number for the reconstructions
of Fig. 9.


small (to account for computer roundoff error) for the noiseless
case, there is little difference between the two AD filters,
and the corresponding reconstructions are almost identical.
We expect the differences between the filters to become
more apparent for the case of noisy Fourier intensity data.

We now consider the same image with Gaussian noise
added to the Fourier intensity. When the noisy Fourier
intensity is denoted by $|F(u)|_n^2$, the Fourier-modulus error
(FME) with respect to the original Fourier intensity, $|F(u)|^2$,
is

$$\mathrm{FME} = \left\{ \frac{\sum_u \left[ |F(u)|_n - |F(u)| \right]^2}{\sum_u |F(u)|^2} \right\}^{1/2}. \tag{20}$$
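
A minimal sketch of the FME computation follows. It assumes the normalized root-mean-squared form, with sums taken over all Fourier samples; the function name `fourier_modulus_error` and the test data are hypothetical, and the square root in particular is an assumption of this illustration.

```python
import numpy as np

def fourier_modulus_error(noisy_modulus, true_modulus):
    """Normalized rms difference between a noisy and a true Fourier modulus.

    Assumes the root-mean-squared normalization; both inputs are arrays of
    nonnegative moduli sampled on the same Fourier grid.
    """
    num = np.sum((noisy_modulus - true_modulus) ** 2)
    den = np.sum(true_modulus ** 2)
    return np.sqrt(num / den)

rng = np.random.default_rng(5)
true_modulus = np.abs(np.fft.fft2(rng.random((8, 8))))  # hypothetical data
# A uniform 5% overestimate of the modulus gives an FME of 0.05 under this definition.
fme = fourier_modulus_error(1.05 * true_modulus, true_modulus)
print(fme)
```

Under this definition the FME is a dimensionless fraction, so the 5% and 20% cases correspond to values of 0.05 and 0.20.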


We performed reconstructions for single realizations of $|F|_n^2$
with 5% and 20% FME. Because the AD algorithm has a
Wiener-type filter built into it, a less-prejudiced comparison
between algorithms is obtained if we filter the noisy Fourier
intensity before use with the ER and HIO algorithms. The
pre-Wiener-filtered modulus that is used in this case is


lAu)| 


1 

1  +  a-l\F{u)\*„ 


(21) 


where $\sigma^2$ is the variance of the noise added to the Fourier
intensity. Figure 5 demonstrates the effect of Eq. (21) on
ER reconstructions for the two noisy cases. The smoothing
of the pre-Wiener filter has a negligible effect for the 5%
FME data but is more significant for the 20% FME data.

The reconstructions from all four algorithms for the case
of 5% FME are shown in Fig. 6. Since the pre-Wiener
filtering of Eq. (21) was insignificant at the 5% FME noise
level, it was not used in these HIO and ER reconstructions.
The 5% level of noise has little effect on visual image quality,
and the performance of the algorithms relative to one another
is similar to that for the noiseless case. Reconstructions
with 20% FME are shown in Fig. 7. This level of noise
significantly degrades the visual image quality, and the
pre-Wiener filtering was implemented for the HIO and ER
reconstructions. The AD Filter 1 image of Fig. 7(C) has no
striping artifacts and is comparable in quality with the HIO/ER
reconstruction of Fig. 7(A), whereas AD Filter 2 stagnates
with stripes after starting with the same initial guess.
The low-pass nature of the Wiener-type filter has a smoothing
effect that is evident in the AD reconstructions. The
amount of smoothing depends on the filter parameters $\alpha$ and
$\sigma^2$: the larger these parameters are, the larger the attenuation
of high frequencies and the smoother the reconstruction.
In this case the two AD reconstructions achieve a
smaller ABSERR than either ER or HIO/ER (Fig. 8) but at
the expense of image sharpness. The reconstructions stagnate
almost immediately, but a change in $\alpha$ after 400 iterations
moves the AD Filter 1 image out of stripe stagnation.
The ability to vary the built-in Wiener-type filter parameters
may be an advantage of the AD algorithm. The AD
algorithm also may be making better use of the Wiener filter,
and a few iterations of AD Filter 1 on the HIO/ER image of
Fig. 7(A) yield an image that is similar to that of Fig. 7(C).

Figure 9 shows the reconstructions from all four algorithms
for case (2), a real-valued, nonnegative image with
unknown support in a 128 X 128 array. The support was
estimated from the support of the autocorrelation, $r(x)$, using
a triple-intersection algorithm.^ Figure 10 is a plot of
ABSERR versus iteration number for the reconstructions of
Fig. 9. The HIO/ER algorithm converged close to the solution
in fewer than 200 iterations, whereas AD and ER both
converged more slowly and stagnated after approximately
400 iterations. The error of the ER reconstruction is significantly
lower than that of the AD algorithms. For this
more-difficult case, we find again that the AD and ER algorithms
perform comparably (ER somewhat better than AD), and
HIO/ER is still more effective than either.


4.  CONCLUSION 

We have shown that the Ayers-Dainty (AD) blind deconvolution
algorithm applied to phase retrieval is similar to the
error-reduction (ER) iterative Fourier-transform algorithm,
both in form and in performance. A nice feature of the AD
algorithm is a built-in Wiener-type filter, which seems to
perform slightly better than the pre-Wiener filter used with
hybrid input-output (HIO) and ER for the noisier case.
The two different Wiener-type filters considered here performed
comparably, and the significant difference between
them is that Filter 1 [Eq. (14)] is simpler to implement than
Filter 2 [Eq. (15)]. For the more difficult case of reconstructing
an object with unknown support, the AD algorithm
was not quite so effective as ER and did not converge close to
a solution as did the combination of HIO and ER (HIO/ER).
HIO/ER is still the most effective reconstruction algorithm
at low noise levels, and at higher levels of noise the AD
algorithm can be used in conjunction with HIO to improve
the quality of the reconstruction.

Both a new iterative grid-search technique and the iterative Fourier-transform algorithm are used to illuminate the
relationships among the ambiguous images nearest a given object, error metric minima, and stagnation points of
phase-retrieval algorithms. Analytic expressions for the subspace of ambiguous solutions to the phase-retrieval
problem are derived for 2 x 2 and 3 x 2 objects. Monte Carlo digital experiments using a reduced-gradient search of
these subspaces are used to estimate the probability that the worst-case nearest ambiguous image to a given object
has a Fourier modulus error of less than a prescribed amount. Probability distributions for nearest ambiguities are
estimated for different object-domain constraints.


1.  INTRODUCTION 

The phase-retrieval problem considered in this paper is the
reconstruction of an object function $f(x, y)$ from the modulus
$|F(u, v)|$ of its Fourier transform:

$$F(u, v) = |F(u, v)| \exp[i\psi(u, v)] = \mathcal{F}[f(x, y)] = \iint f(x, y) \exp[-i 2\pi (ux + vy)]\, dx\, dy. \tag{1}$$

It is equivalent to the reconstruction of the Fourier phase
$\psi(u, v)$ from the Fourier modulus and to the reconstruction
of $f(x, y)$ or $\psi(u, v)$ from the autocorrelation function

$$r(x, y) = \mathcal{F}^{-1}[|F(u, v)|^2]. \tag{2}$$

This problem arises in several disciplines, including optical
and radio astronomy, wave-front sensing, holography, and
remote sensing.

There are the omnipresent ambiguities: the object
$f(x, y)$, any translation of the object $f(x - x_0, y - y_0)$, the twin
image $f^*(-x - x_0, -y - y_0)$, and any of these multiplied by a
constant of unit magnitude $\exp(i\phi_c)$ all have exactly the
same Fourier modulus. These ambiguities change only the
object's position or orientation, not its appearance. If they
are the only ambiguities, then we refer to the object as being
unique. A solution is considered to be ambiguous only if it
differs from the object in ways other than these omnipresent
ambiguities.
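
The omnipresent ambiguities are easy to confirm numerically. The following minimal sketch assumes a discrete, cyclic setting in which translations are circular shifts; the complex-valued test object, the shift amounts, and the phase constant are hypothetical values chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical complex-valued object with support in a 4 x 4 corner.
f = np.zeros((8, 8), dtype=complex)
f[:4, :4] = rng.random((4, 4)) + 1j * rng.random((4, 4))

def modulus(a):
    """Fourier modulus of a 2-D array."""
    return np.abs(np.fft.fft2(a))

translated = np.roll(f, shift=(2, 3), axis=(0, 1))  # f(x - x0, y - y0), cyclic shift
twin = np.conj(f[::-1, ::-1])                       # f*(-x - x0, -y - y0) for some (x0, y0)
phased = np.exp(1j * 0.7) * f                       # unit-magnitude constant times f

for candidate in (translated, twin, phased):
    print(np.allclose(modulus(candidate), modulus(f)))
```

All three candidates reproduce the Fourier modulus of `f`, which is why a reconstruction can be recovered at best up to position, a 180-degree rotation with conjugation, and a constant phase.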

If nothing is known about the object, then reconstruction
from its Fourier modulus is generally ambiguous except for
special cases. Fortunately, for many applications one has
additional a priori knowledge about or constraints on the
object. In the astronomy application, for example, the
object's spatial brightness distribution, $f(x, y)$, is a real,
nonnegative function. For several applications, one has a
support constraint, i.e., the object is known to be zero outside
some finite area. Even if the support constraint is not
known a priori, upper bounds can be placed on the support
of the object since it can be no larger than half the diameter
of the autocorrelation along any direction. Additional
measurements or other forms of a priori information may be
available for specific applications; in this paper we consider
real-valued objects with known support, both with and without
a nonnegativity constraint.

Until the late 1970's, there was much doubt that the
phase-retrieval problem could be solved or that the solution
would be useful, because the one-dimensional theory of analytic
functions available at the time indicated that there
were ordinarily a huge number of ambiguous solutions.'- ’

The first indications that the two-dimensional (2-D) case
is usually unique, despite the lack of uniqueness in one
dimension, came from empirical reconstruction results'* ^:
images that were reconstructed resembled the original simulated
objects used to compute the Fourier modulus data.
These results gave hope that 2-D phase-retrieval problems
might be solvable and unique. (Other phase-retrieval problems,
such as in electron microscopy in which one has
squared-modulus measurements in each of two domains^
and in x-ray crystallography in which one has the a priori
information that the object consists of a finite collection of
atoms,' had been solved; but those earlier successes depended
on much greater object-domain constraints than just
nonnegativity and support.) Those empirical results gave
impetus to attempts to extend the one-dimensional (1-D)
theory to two dimensions. Although progress has been made,*^
the level of understanding of the 2-D problem has not yet
matched that of the 1-D problem.

One of the most enlightening developments has been the
work of Bruck and Sodin,'-* who modeled the object distribution
as an array of delta functions on a regular grid. Then
the continuous Fourier transform becomes the discrete Fourier
transform (DFT),

$$F(u, v) = |F(u, v)| \exp[i\psi(u, v)] = \mathrm{DFT}[f(x, y)] = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \exp\left[-i\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)\right], \tag{3}$$

where the DFT is taken over a 2M X 2N array but $f(x, y)$ is
zero outside an M X N array in order to avoid aliasing in the
computation of $r(x, y)$ and $|F(u, v)|^2$. For this discrete case
the Fourier transform in Eq. (3) can then be expressed
as a polynomial of two complex variables, $z = \exp(i\pi u / M)$
and $w = \exp(i\pi v / N)$. It is also equivalent to the z transform.
Then the presence of ambiguity in the phase-retrieval problem
is equivalent to the factorability of the polynomial.
This explains the vast difference between the 1-D and 2-D
cases, because polynomials (of degree 2 or greater) of a single
complex variable are always factorable, whereas polynomials
of two (or more) complex variables are rarely
factorable.'^ Other interesting results have been obtained by
exploiting this discrete model. Fiddy et al.' and
Nieto-Vesperinas and Dainty"' described an object support that,
by virtue of Eisenstein's irreducibility theorem, guarantees
uniqueness. Brames"* showed that any discrete object having
a support whose convex hull has no parallel sides is
unique among objects with supports having the same convex
hull; so if the convex hull of the support of such an object is
known a priori, then it is unique. For these cases, there also
exists a closed-form recursive reconstruction algorithm.'-" -*
Whether the objects are discrete or continuous, it is easy
to make up cases that are ambiguous. If $g(x, y)$ and $h(x, y)$
are two functions of finite support with Fourier transforms
$G(u, v)$ and $H(u, v)$, respectively, then the convolutions

$$f_1(x, y) = g(x, y) * h(x, y) \tag{4}$$

and

$$f_2(x, y) = g(x, y) * h^*(-x, -y) \tag{5}$$

are different objects as long as neither $g$ nor $h$ is conjugate
centrosymmetric. They have Fourier transforms

$$F_1(u, v) = G(u, v)\, H(u, v) \tag{6}$$

and

$$F_2(u, v) = G(u, v)\, H^*(u, v) \tag{7}$$

that have the same modulus,

$$|F_1(u, v)| = |F_2(u, v)| = |G(u, v)|\, |H(u, v)|, \tag{8}$$

and the objects $f_1$ and $f_2$ are ambiguous. This demonstrates
the equivalence of phase-retrieval ambiguity to convolutions
in the object domain [Eqs. (4) and (5)] and factorability in
the Fourier domain [Eqs. (6) and (7)]. Furthermore, if there
are $K$ irreducible Fourier factors, then there are $2^{K-1}$ ambiguous
solutions. By this convolutional (products or factors in
the Fourier domain) method, it is possible to make up an
uncountably infinite number of ambiguous cases even
though the theory indicates that ambiguity is rare (of zero
probability) in two dimensions. Consider that it is also true
that any randomly chosen real number has probability zero
of being a rational number (almost all are irrational numbers).
Yet any real number, even if irrational, can be
approximated arbitrarily well by a rational number. Thus the
fact that the probability of any given object's being ambiguous
(the Fourier transform being factorable) is zero is not
necessarily comforting.
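
The convolutional construction of Eqs. (4) through (8) can be reproduced with a small numerical experiment. This is a sketch under stated assumptions: the factors `g` and `h` are hypothetical real-valued 3 x 3 random arrays, convolutions are computed cyclically with FFTs on an array large enough to avoid wraparound, and the helper names `embed` and `conv` are introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8  # large enough that a 3 x 3 by 3 x 3 convolution (5 x 5) does not wrap around

def embed(a, n=N):
    """Place a small array in the corner of an n x n array of zeros."""
    out = np.zeros((n, n))
    out[: a.shape[0], : a.shape[1]] = a
    return out

def conv(a, b):
    """Cyclic convolution of two real arrays via the convolution theorem."""
    return np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)).real

g = rng.random((3, 3))  # hypothetical factor g(x, y)
h = rng.random((3, 3))  # hypothetical factor h(x, y)

f1 = conv(embed(g), embed(h))              # Eq. (4)
# For real h, h*(-x, -y) is h flipped about the origin; flipping inside the
# 3 x 3 block yields a translated copy, which leaves the modulus unchanged.
f2 = conv(embed(g), embed(h[::-1, ::-1]))  # Eq. (5), up to a translation

same_modulus = np.allclose(np.abs(np.fft.fft2(f1)), np.abs(np.fft.fft2(f2)))
different_objects = not np.allclose(f1, f2)
print(same_modulus, different_objects)
```

The two nonnegative objects share a Fourier modulus yet differ in appearance, which is the kind of ambiguity that a support or nonnegativity constraint alone cannot remove.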

Sanz et al. have shown that the "uniqueness condition is
stable in the sense that it is not sensitive to noise." However,
their analysis does not shed light on a more practical
definition of uniqueness. If a given nonfactorable polynomial
is near enough (in an integrated mean-squared difference
sense) to a factorable polynomial, then the ambiguous
solutions associated with the factorable polynomial will be
consistent (to within the noise) with the noisy Fourier-modulus
data. Under this circumstance the object may be considered
to be ambiguous in a practical sense, even though it
may be unique, traditionally speaking. Up to this point it
was not known how close an arbitrary polynomial is, on the
average, to a factorable polynomial. Furthermore, the existence
of ambiguous objects close to a given object is likely to
cause the existence of local minima in which iterative
reconstruction algorithms will become trapped. Current theory
has not adequately addressed these questions, even for the
discrete model. These questions can be answered, though,
by numerical means, as will be seen below.

One way to test for practical uniqueness is the use of the iterative Fourier-transform algorithm. If multiple solutions exist, then the algorithm tends to find all of them if many reconstructions are performed, each starting from a different array of random numbers as the initial estimate. In most instances investigated, when the algorithm is applied to the Fourier modulus of an object of interest, if it does not stagnate it reconstructs essentially the correct object, giving strong evidence of uniqueness for those types of object. Furthermore, when noise is added to the Fourier-modulus data, the result is usually a noisy image of the object rather than a completely different reconstruction, contrary to some predictions. While this approach has provided some assurance that the phase-retrieval problem is usually unique in the practical sense even in the presence of noise, it has not yielded any quantitative results on the probability of uniqueness for any given level of noise.

An important consideration in the probability of uniqueness is the set of constraints placed on the object. In all cases we assume that the object has finite support (it is zero outside some finite region). The support of the object plays a crucial role. If the object has a delta function known to satisfy the holography condition, then it is unique. As mentioned above, discrete objects having certain supports are guaranteed to be unique. In addition, objects having separated parts are more likely to be unique. Although it is less well understood, nonnegativity also plays an important role in uniqueness.

In this paper we establish a methodology for determining the probability of phase-retrieval uniqueness in the practical sense. We have developed a method, suitable for small images, for answering the questions: Given an arbitrary object and its Fourier polynomial, how close is the nearest factorable polynomial, and does it have an ambiguous solution that is significantly different from the given object? In this paper we explore these questions for the case of objects defined within $2 \times 2$ and $3 \times 2$ supports. A derivation of object-domain conditions for factorability provides a means for finding nearest factorable polynomials through a constrained-minimization search over the space of $2 \times 2$ or $3 \times 2$ ambiguous images. These searches are implemented with different object-domain constraints in a Monte Carlo simulation to estimate the probability that the nearest factorable polynomial, with an ambiguous solution that is significantly different from a given object, is within some distance of the given polynomial. Before describing these main results, we first define the pertinent error metrics and discuss some preliminary results of a grid-search method for finding local minima in phase retrieval, and relationships among minima, ambiguities, and phase-retrieval stagnation.
2.  OBJECT-TO-FREQUENCY-DOMAIN MAPPINGS AND ERROR METRICS

A useful means for visualizing the ambiguity problem is through a mapping between the space of objects (images) and the space of Fourier moduli as illustrated in Fig. 1. In Fig. 1 each domain is a finite-dimensional space in which any one point represents a 2-D function. In this diagram $|F(u, v)|$ represents Fourier-modulus data for a unique object and $|G_a(u, v)|$ modulus data for an ambiguous object, since both $g_a$ and $g_{ac}$ map into it. We refer to $g_a$ and $g_{ac}$ as ambiguous counterparts of each other, gotten by conjugating one or more of the Fourier-domain polynomial factors. For the case depicted in Fig. 1, as indicated by the distances between the points, two widely different images, $f$ and $g_{ac}$, may have similar, but not identical, Fourier moduli. Thus, although $f$ is unique, one might unknowingly reconstruct $g_{ac}$ by a phase-retrieval algorithm given a noisy measurement of $|F|$.

The following error metrics provide a means for quantifying differences in both domains. These metrics are the focus of the numerical approach presented in this paper. (Other related error metrics are also useful.) Given two real-valued functions $g(x, y)$ and $f(x, y)$ defined on an $M \times N$ support and zero padded to a $2M \times 2N$ array, we define the Fourier-modulus error, the error (distance) between $|F(u, v)|$ and $|G(u, v)|$, as

$$\epsilon^2(g, f) = \frac{\sum_{u,v} \left[\alpha_F |G(u, v)| - |F(u, v)|\right]^2}{\sum_{u,v} |F(u, v)|^2}, \tag{9}$$

where

$$\alpha_F = \left[\frac{\sum_{u,v} |F(u, v)|^2}{\sum_{u,v} |G(u, v)|^2}\right]^{1/2} \tag{10}$$

is an energy normalization factor, $G(u, v) = \mathrm{DFT}[g(x, y)]$, and the $u$ and $v$ summations are taken over the intervals $0, 1, \ldots, 2M - 1$ and $0, 1, \ldots, 2N - 1$, respectively.

Fig. 1. Object-space to Fourier-modulus-space mappings of a unique object $f$ and a pair of ambiguous images ($g_a$, $g_{ac}$), with error metrics $\epsilon$ and $\delta$.

A similar metric defines the object-domain error between $f(x, y)$ and $g(x, y)$:

$$\delta^2(g, f) = \frac{\sum_{x,y} \left[\alpha_o\, g(x, y) - f(x, y)\right]^2}{\sum_{x,y} f^2(x, y)}, \tag{11}$$

where

$$\alpha_o = \alpha_F\, \mathrm{sgn}\left[\sum_{x,y} f(x, y)\, g(x, y)\right] \tag{12}$$

and the $x$ and $y$ summations are taken over $0, 1, \ldots, M - 1$ and $0, 1, \ldots, N - 1$, respectively. The parameter $\alpha_o$ takes into account any differences in scaling and polarity between $g$ and $f$. Translations are ignored here because the support constraint automatically rules them out. Because $g(x, y)$ and its twin, $g(M - 1 - x, N - 1 - y)$, share the same Fourier modulus, we compute $\delta(g, f)$ for both $g(x, y)$ and its twin and use the smaller of the two values of $\delta$. Of particular interest from the point of view of phase retrieval are images that have a small Fourier-modulus error $\epsilon$ but a large object-domain error $\delta$, since these images may be ambiguous in the practical sense.
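The two metrics can be sketched in a few lines of NumPy. This is an illustrative reading of Eqs. (9) through (12), not the authors' code: it assumes the energy normalization $\alpha_F = [\sum |F|^2 / \sum |G|^2]^{1/2}$ and $\alpha_o = \alpha_F\,\mathrm{sgn}[\sum f g]$, and the function names and the sample image `f` are hypothetical. Images are stored as arrays with one row per line of the printed matrices.

```python
import numpy as np

def fourier_modulus_error(g, f):
    """epsilon(g, f): normalized distance between the Fourier moduli."""
    rows, cols = f.shape
    F = np.abs(np.fft.fft2(f, (2 * rows, 2 * cols)))   # zero padded by 2x
    G = np.abs(np.fft.fft2(g, (2 * rows, 2 * cols)))
    alpha_F = np.sqrt(np.sum(F**2) / np.sum(G**2))     # assumed energy normalization
    return np.sqrt(np.sum((alpha_F * G - F) ** 2) / np.sum(F**2))

def object_domain_error(g, f):
    """delta(g, f): smaller of the errors for g and for its twin."""
    def one_orientation(gg):
        # By Parseval's theorem the energy ratio is the same in both domains.
        alpha_F = np.sqrt(np.sum(f**2) / np.sum(gg**2))
        alpha_o = alpha_F * np.sign(np.sum(f * gg))    # scaling and polarity
        return np.sqrt(np.sum((alpha_o * gg - f) ** 2) / np.sum(f**2))
    return min(one_orientation(g), one_orientation(g[::-1, ::-1]))

# Hypothetical example: a scaled, negated twin of f has zero error in both domains.
f = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 7.0]])
g = -2.0 * f[::-1, ::-1]
print(fourier_modulus_error(g, f), object_domain_error(g, f))
```

Both functions divide by the energy of `g`, so the all-zero image must be excluded by the caller.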


3.  GRID  SEARCHES 

Our first approach to understanding the relationship between $\epsilon$ and $\delta$, for a collection of images $g$ relative to a given object $f$, was by a grid search. What we mean by a grid search is illustrated as follows for the case of $3 \times 2$ ($M = 3$, $N = 2$) objects. Given a $3 \times 2$ object $f$, we calculate $\epsilon$ and $\delta$ for all $3 \times 2$ images $g = g_{\mathrm{ref}} + g_{\mathrm{inc}}$, where $g_{\mathrm{ref}}$ is another $3 \times 2$ real-valued image and

$$g_{\mathrm{inc}} = \begin{bmatrix} s_1 & s_2 & s_3 \\ s_4 & s_5 & s_6 \end{bmatrix}, \tag{13}$$

where, given a real-valued increment $\Delta s$, each $s_j$ can assume values in the set $\{k\,\Delta s;\ k = -L, -L + 1, \ldots, 0, 1, \ldots, L\}$. If we think of both $f$ and $g$ as points in a six-dimensional (6-D) space, then we are calculating $\epsilon$ and $\delta$ for all $g$'s sampled on a symmetric 6-D grid of step size $\Delta s$ centered about the point $g_{\mathrm{ref}}$, with the grid width equal to $2L + 1$ steps in each of the six dimensions.

This search can become quite extensive as the grid width increases. Since the number of different $g_{\mathrm{inc}}$'s (grid points) is $(2L + 1)^6$, even a five-step search ($L = 2$) requires $5^6 = 15{,}625$ calculations of $\epsilon$ and $\delta$. If the search uses the zero image for $g_{\mathrm{ref}}$, we can cut down on redundant calculations of $\epsilon$ by eliminating twin images and images with polarity [sign of $F(0, 0)$] opposite $f$. Note that the saving is in the calculation of $\epsilon$, which is computationally more expensive than the calculation of $\delta$.

Grid-Search  Example 

The use of a successively finer grid search to find minima in $\epsilon$ (which could constitute a phase-retrieval algorithm) and shed light on the properties of $\epsilon$ and $\delta$ is illustrated in the following example. An integer-valued image $f$ was chosen:

$$f = [\,\text{matrix entries not legible in the source}\,] \tag{14}$$

Fig. 2. Fourier-modulus error $\epsilon$ versus object-domain error $\delta$ for a five-step grid search with step size $\Delta s = 1$. The minimum value of $\epsilon$ (excluding $g = f$) is boxed.

A five-step search ($L = 2$) was implemented with $g_{\mathrm{ref}}$ equal to the zero function and with $\Delta s = 1$; i.e., the pixel values of $g$ are taken from the set $\{-2, -1, 0, 1, 2\}$. Since the search is centered about the zero function, the twin and polarity search-reduction techniques mentioned above were implemented. The results are displayed in Fig. 2 in the form of a scatter plot of $\epsilon$ versus $\delta$. Several features of the scatter plot are noted:

(1)  $\epsilon$ is less than or equal to $\delta$. The proof of this fact is given in Appendix A.

(2)  The vertical striping reflects the discrete nature of the search, i.e., the elements of $g$ take on only integer values.

(3)  $\epsilon$ and $\delta$ can both be greater than unity, despite the normalization that takes place in the denominators of Eqs. (9) and (11).

(4)  The scatter plot exhibits a banded type of structure, i.e., the points tend to cluster in a region where both $\delta$ and $\epsilon$ are large. This is not surprising, since we expect most images that are quite different in the object domain to be quite different in the Fourier-modulus domain as well.

The single point of greatest interest, an outlier with large $\delta$ and relatively small $\epsilon$, is outlined by a box in Fig. 2. It corresponds to the image

$$g_0 = [\,\text{matrix entries not legible in the source}\,] \tag{15}$$

with $\delta(g_0, f) = 0.714$ and $\epsilon(g_0, f) = 0.124$. It is the point within the grid search with the lowest value of $\epsilon$ aside from $g = f$. Since it represents the point on the grid search closest to being a serious ambiguity, we explored it further by performing another five-step search, with $g_{\mathrm{ref}} = g_0$ of Eq. (15) and a step size of $\Delta s = 1/3$. Because $g_{\mathrm{ref}}$ is not the zero function, no data reduction was implemented, and $\epsilon$ and $\delta$ were calculated for the 15,625 different grid points. Figure 3 shows the scatter plot for this second search for $\epsilon < 0.125$. It is apparent that our initial search with unit steps was quite coarse and that, compared with $g_0$, there are images with significantly smaller values of $\epsilon$ and comparably large values of $\delta$. The minimum value of $\epsilon$ for this grid search corresponds to the image

$$g_1 = [\,\text{matrix entries not legible in the source}\,] \tag{16}$$

with $\delta(g_1, f) = 0.704$ and $\epsilon(g_1, f) = 0.0648$.

We performed a third five-step search, with $g_{\mathrm{ref}} = g_1$ and $\Delta s = 1/9$. The image corresponding to the minimum $\epsilon$ for this search is

$$g_2 = [\,\text{matrix entries not legible in the source}\,] \tag{17}$$

with $\delta(g_2, f) = 0.666$ and $\epsilon(g_2, f) = 0.0569$.


Iterative  Grid  Searches 

The iterative searching above is an approach for finding minima of $\epsilon$. It is summarized more generally by the following steps for the case of $M \times N = 3 \times 2$.

(1)  Initialize. Choose $g_{\mathrm{ref}}$, the number of search steps ($2L + 1$), the step size ($\Delta s$), and a step-size reduction factor ($r$).

(2)  Perform a 6-D ($2L + 1$)-step search with $g = g_{\mathrm{ref}} + g_{\mathrm{inc}}$, where

$$g_{\mathrm{inc}} = \begin{bmatrix} s_1 & s_2 & s_3 \\ s_4 & s_5 & s_6 \end{bmatrix} \tag{18}$$

and each $s_j$, $j = 1, 2, \ldots, 6$, is from the set $\{k\,\Delta s;\ k = -L, -L + 1, \ldots, 0, 1, \ldots, L\}$.

(3)  Set $g_{\mathrm{ref}}$ equal to the image, $g$, which has the minimum value of $\epsilon$ found in the search of the previous step.

(4)  Set $\Delta s$ equal to $\Delta s / r$.

(5)  Stop if the stopping criterion is met; otherwise go to step 2.
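The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names, the relative-change tolerance `tol`, the test image `f_demo`, and the starting image `start` are all hypothetical, and the error `eps` assumes the energy-normalized form of the Fourier-modulus error.

```python
import itertools
import numpy as np

def eps(g, f):
    """Fourier-modulus error with assumed energy normalization."""
    shape = (2 * f.shape[0], 2 * f.shape[1])
    F = np.abs(np.fft.fft2(f, shape))
    G = np.abs(np.fft.fft2(g, shape))
    alpha = np.sqrt(np.sum(F**2) / np.sum(G**2))
    return np.sqrt(np.sum((alpha * G - F) ** 2) / np.sum(F**2))

def iterative_grid_search(f, g_ref, L=1, ds=0.5, r=2.0, max_iter=5, tol=1e-3):
    """Shrinking-grid search for a minimum of eps around g_ref."""
    offsets = np.arange(-L, L + 1)
    prev = np.inf
    best_e = np.inf
    for _ in range(max_iter):
        best_g, best_e = g_ref, np.inf
        # Step 2: visit all (2L + 1)**6 grid points centered on g_ref.
        for ks in itertools.product(offsets, repeat=f.size):
            g = g_ref + ds * np.reshape(ks, f.shape)
            if not g.any():
                continue                      # the zero image has no defined eps
            e = eps(g, f)
            if e < best_e:
                best_g, best_e = g, e
        g_ref, ds = best_g, ds / r            # steps 3 and 4
        # Step 5: stop on a small relative change in the minimum.
        if np.isfinite(prev) and abs(prev - best_e) <= tol * prev:
            break
        prev = best_e
    return g_ref, best_e

# Hypothetical object and starting image (2 rows by 3 columns).
f_demo = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 7.0]])
start = np.array([[1.0, 0.0, -2.0], [1.0, -1.0, -3.0]])
g_found, e_found = iterative_grid_search(f_demo, start, L=1, ds=0.5, r=2.0, max_iter=3)
print(g_found, e_found)
```

Because the center of each grid is itself a grid point, the minimum error found cannot increase from one iteration to the next. With `L = 1` each pass costs only $3^6 = 729$ evaluations, which keeps the sketch fast at the price of a narrow search region.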


Fig. 3. Fourier-modulus error $\epsilon$ versus object-domain error $\delta$ for a five-step grid search with $\Delta s = 1/3$ about the minimum of the grid search of Fig. 2. All points satisfying $\epsilon < 0.125$ are shown here.
The stopping criterion is based on the percentage change in the minimum value of $\epsilon$ from iteration to iteration, or set for a maximum number of iterations, whichever is satisfied first. For a large value of $L$, the search time is prohibitive, but the sampling is finer. Also, the initial step size and step-reduction factor must be chosen carefully, since the step size at the $k$th iteration is $\Delta s / r^{k-1}$. If $r$ is chosen too large, the grid may shrink too quickly to progress to a minimum. If $\Delta s$ is too small, the minimum might not be found because it lies outside the initial grid. The most reliable search uses a slowly shrinking grid with a large number of grid points (large $L$) that samples the space over a large region. The more finely we sample the space, the more computationally burdensome the algorithm becomes, yet a coarser grid would leave doubt about the reliability of our minimum.

This iterative search could constitute a phase-retrieval algorithm. However, it would be a computationally inefficient algorithm, requiring many thousands of DFT's to converge to a solution for the case of larger objects. Here we are using it only to find a local minimum (the global minimum is at $g = f$, for which $\epsilon = \delta = 0$).

The iterative grid search was tested for $f$ given by Eq. (14) and with the following three sets of parameters: (1) $L = 1$, $\Delta s = 1/2$, $r = 2$; (2) $L = 2$, $\Delta s = 1/3$, $r = 3$; and (3) $L = 3$, $\Delta s = 1/4$, $r = 4$. Each iterative search started with $g_{\mathrm{ref}} = g_0$ given by Eq. (15), corresponding to the minimum $\epsilon$ found in the first search described above. Each of these searches found a scalar multiple of the same image, $g_{\min}$, given by

$$g_{\min} = \begin{bmatrix} 0.623 & 0.749 & -1.871 \\ 1.149 & -0.659 & -2.530 \end{bmatrix}, \tag{19}$$

with $\delta(g_{\min}, f) = 0.667$ and $\epsilon(g_{\min}, f) = 0.0558$. This probably represents a deep local minimum for the phase-retrieval problem and could represent a practical ambiguity if the noise in the Fourier-modulus data were to exceed $\epsilon(g_{\min}, f)$.

4.  MINIMA  AND  PHASE  RETRIEVAL 

The minimum in $\epsilon$, represented by $g_{\min}$ found in the iterative grid searches described above, represents two potential problems for phase retrieval. First, a relatively small error in the modulus data (5.58%) could cause the data to be consistent with $g_{\min}$, which, if reconstructed, would have a very large object-domain error (66.7%). Second, even when it is performing phase retrieval with error-free modulus data, the algorithm could get trapped and stagnate at this local minimum. In particular, the error-reduction (ER) version of the iterative transform algorithm is equivalent to a steepest-descent gradient search method on a cost function closely related to $\epsilon$. Thus, if the local minimum found in our iterative searches were a true local minimum, the ER algorithm could stagnate at this image, unable to find a direction in which to descend. To visualize how $\epsilon$ and $\delta$ vary around $g_{\min}$, we plot $\epsilon$ and $\delta$ along the line joining $f$ [Eq. (14)] and $g_{\min}$ [Eq. (19)]. Figure 4 shows $\epsilon(g, f)$ and $\delta(g, f)$ versus $t$ for

$$g = f + t\,(g_{\min} - f). \tag{20}$$

While Fig. 4 represents only a 1-D slice through a 6-D space, it gives the appearance of a minimum in $\epsilon$ at $t = 1$ ($g = g_{\min}$).

Fig. 4. $\epsilon(g, f)$ and $\delta(g, f)$ versus $t$ for $g = f + t\,(g_{\min} - f)$, the line joining $f$ and $g_{\min}$.


When ER is performed on $|F|$ with $g_{\min}$ as the initial guess, stagnation occurs immediately, giving further evidence of the presence of a local minimum.

As another test of ER's tendency to stagnate at a minimum in $\epsilon$, we use $g$'s corresponding to different values of $t$ in Eq. (20) as initial guesses. These values are selected on both sides of the peak in the $\epsilon$ curve in Fig. 4. We might expect values of $t$ chosen on the right-hand side of the peak to correspond to initial guesses that stagnate at $g_{\min}$ and guesses chosen to the left of the peak to converge to the correct solution, $f$. Several values of $t$ were selected on both sides of the peak, and the predicted result was verified for all initial guesses.

The hybrid input-output (HIO) version of the iterative Fourier-transform algorithm is one way of climbing out of local minima. Simulated annealing is another. Cycles of HIO iterations followed by ER iterations were used with a variety of starting points: $g_0$, $g_1$, $g_2$, and $g_{\min}$. In each case the HIO/ER combination converged to the correct solution, $f$, although ER by itself stagnated in each of these same cases. As we will see below, HIO is not always sufficient to overcome stagnation.
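For readers unfamiliar with the two iterations, the following is a minimal sketch of one ER step and one HIO step using only a support constraint. It is an illustration under assumptions, not the code used in this study: the function names, the feedback parameter `beta = 0.7`, the test image, and the random starting guess are hypothetical, and a nonnegativity constraint (used in some experiments in this paper) is omitted.

```python
import numpy as np

def impose_modulus(g, F_mod):
    """Replace the Fourier modulus of g by the measured modulus, keeping its phase."""
    G = np.fft.fft2(g)
    return np.real(np.fft.ifft2(F_mod * np.exp(1j * np.angle(G))))

def er_step(g, F_mod, support):
    """Error reduction: zero the output wherever the support constraint is violated."""
    return np.where(support, impose_modulus(g, F_mod), 0.0)

def hio_step(g, F_mod, support, beta=0.7):
    """Hybrid input-output: apply feedback outside the support instead of zeroing."""
    g_prime = impose_modulus(g, F_mod)
    return np.where(support, g_prime, g - beta * g_prime)

def modulus_error(g, F_mod):
    """Normalized Fourier-domain error of the current estimate."""
    return np.sqrt(np.sum((np.abs(np.fft.fft2(g)) - F_mod) ** 2) / np.sum(F_mod**2))

# Hypothetical 2-row by 3-column object embedded in a zero-padded 4 x 6 array.
f_pad = np.zeros((4, 6))
f_pad[:2, :3] = [[1.0, 2.0, 3.0], [4.0, 5.0, 7.0]]
support = np.zeros((4, 6), dtype=bool)
support[:2, :3] = True
F_mod = np.abs(np.fft.fft2(f_pad))

rng = np.random.default_rng(0)
g = np.where(support, rng.random((4, 6)), 0.0)   # random initial guess
errors = []
for _ in range(50):
    errors.append(modulus_error(g, F_mod))
    g = er_step(g, F_mod, support)
print(errors[0], errors[-1])
```

The Fourier-domain error of ER is nonincreasing from iteration to iteration, which is why ER can settle at a local minimum; HIO does not share that property, which is what allows it to move away from a stagnation point before ER iterations are resumed.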


5.  MINIMA  AND  AMBIGUOUS  IMAGES 

A clue to the understanding of the stagnation point described above is its relationship to ambiguous images. Consider again the object $f$ given by Eq. (14). Using methods that are described below, one can verify that the $3 \times 2$ ambiguous image whose Fourier modulus is closest to the Fourier modulus of the object $f$ is

$$g_a = \begin{bmatrix} 0.594 & 1.624 & -1.211 \\ 2.330 & 1.415 & -1.730 \end{bmatrix}, \tag{21}$$

with $\delta(g_a, f) = 0.217$ and $\epsilon(g_a, f) = 0.0859$. The ambiguous counterpart to $g_a$ [gotten by conjugating one of the factors of $G_a(u, v)$] is

$$g_{ac} = \begin{bmatrix} -0.363 & -0.618 & 1.987 \\ -1.422 & 0.600 & 2.837 \end{bmatrix}, \tag{22}$$

with $\delta(g_{ac}, f) = 0.677$ and $\epsilon(g_{ac}, f) = 0.0859$. A comparison of $g_{\min}$ [Eq. (19)] with $-g_{ac}$ (which, for our purposes, is equivalent to $g_{ac}$) reveals a similarity between this pair of images. The error metrics reveal their similarity in both domains: $\delta(-g_{ac}, g_{\min}) = \delta(g_{ac}, g_{\min}) = 0.113$ and $\epsilon(g_{ac}, g_{\min}) = 0.0663$.

Because $-g_{ac}$ and $g_{\min}$ are quite similar, we might expect the ER algorithm with an initial guess of $-g_{ac}$ to stagnate at $g_{\min}$. This is indeed the case after approximately 50 iterations. This result, coupled with the similarity between $f$ [Eq. (14)] and its nearest ambiguity, $g_a$, might lead us to conclude that ER would find the correct solution if it were started with an initial guess of $g_a$. This is not the case, however, and the algorithm stagnates after fewer than 20 iterations at

$$g_{\mathrm{stag}} = \begin{bmatrix} 0.694 & 1.778 & -1.010 \\ 2.235 & 1.355 & -1.856 \end{bmatrix}, \tag{23}$$

with $\delta(g_{\mathrm{stag}}, f) = 0.152$ and $\epsilon(g_{\mathrm{stag}}, f) = 0.0631$. This stagnation point is close to $g_a$, with $\delta(g_a, g_{\mathrm{stag}}) = 0.0828$ and $\epsilon(g_a, g_{\mathrm{stag}}) = 0.0577$. Because $g_{\mathrm{stag}}$ is not in the range of the iterative grid searches that found $g_{\min}$, it was not found earlier. A plot of $\epsilon$ and $\delta$ along the line joining $f$ and $g_{\mathrm{stag}}$ is shown in Fig. 5. Despite the difference in vertical scaling, the minimum in Fig. 5 does not appear to be as deep as that in Fig. 4, so one would suspect there might be a good chance of perturbing $g_{\mathrm{stag}}$ enough to get the algorithm out of stagnation. As with $g_{\min}$, it was verified that the HIO is able to move out of stagnation at $g_{\mathrm{stag}}$ and to the solution.

Fig. 6. Object-space to Fourier-modulus-space mappings of an object $f$, two stagnated images $g_{\min}$ and $g_{\mathrm{stag}}$, and the nearest ambiguous image to $f$ with respect to the Fourier-modulus error ($g_a$, $g_{ac}$).

Figure 6 depicts the possible relationships in both domains between $f$, its nearest ambiguous image and counterpart, and the two stagnation points. From the previous results we form the following conjecture: For a given object $f$ and its Fourier modulus $|F|$, stagnation points of the iterative transform algorithm (particularly ER) tend to be near ambiguous images that have Fourier moduli close to $|F|$. This conjecture is supported more strongly by the following example.

Fig. 5. $\epsilon(g, f)$ and $\delta(g, f)$ versus $t$ for $g = f + t\,(g_{\mathrm{stag}} - f)$, the line joining $f$ and $g_{\mathrm{stag}}$.

Consider the following image $f$ and its nearest ambiguity, $g_a$, with ambiguous counterpart $g_{ac}$:

$$f = [\,\text{matrix entries not legible in the source; only the values 2.939 and 1.102 survive}\,] \tag{24}$$

$$g_a = \begin{bmatrix} 0.867 & 3.521 & 1.278 \\ 1.679 & 2.651 & 0.796 \end{bmatrix}, \tag{25}$$

$$g_{ac} = \begin{bmatrix} 0.350 & 2.146 & 3.171 \\ 0.677 & 2.475 & 1.974 \end{bmatrix}, \tag{26}$$


with $\delta(g_a, f) = 0.128$, $\delta(g_{ac}, f) = 0.502$, and $\epsilon(g_a, f) = \epsilon(g_{ac}, f) = 0.00861$. This is a case of a close ambiguity; i.e., the object, $f$, would be ambiguous in the practical sense unless the data, $|F|$, were low in noise. The ER algorithm was run close to 900 times on $|F|$ using a nonnegativity constraint, each time with a different random initial start. The algorithm converged to the correct solution $f$ of Eq. (24) only 10% of the time. The algorithm stagnated near $g_a$ approximately 9% of the time and at several images close to $g_{ac}$ the rest of the time (81%). When a combination of HIO and ER was used with the same set of random starts, convergence to the solution $f$ was improved to a 26% rate. 74% of the time the algorithm stagnated at one of two different minima, $g_{s1}$ and $g_{s2}$, each close to the image $g_{ac}$ in Eq. (26):

$$g_{s1} = \begin{bmatrix} 0.353 & 2.143 & 3.172 \\ 0.684 & 2.470 & 1.976 \end{bmatrix}, \tag{27}$$

35% of the time, with $\delta(g_{s1}, g_{ac}) = 0.00195$ and $\epsilon(g_{s1}, g_{ac}) = 0.00144$, and

$$g_{s2} = \begin{bmatrix} 0.266 & 1.876 & 2.971 \\ 0.746 & 2.711 & 2.222 \end{bmatrix}, \tag{28}$$

39% of the time, with $\delta(g_{s2}, g_{ac}) = 0.0978$ and $\epsilon(g_{s2}, g_{ac}) = 0.0115$. The images $g_{s1}$ and $g_{s2}$ are analogous to $g_{\min}$ in Fig. 6. While convergence to $g_{s1}$ is bad in the sense that $g_{s1}$ is different from the solution $f$ [$\delta(g_{s1}, f) = 0.502$], it is still consistent with the given data [$\epsilon(g_{s1}, f) = 0.00848$] and could be considered a solution (albeit the wrong one). The stagnation at $g_{s2}$ is even more troublesome since it is not only similarly consistent with the given data [$\epsilon(g_{s2}, f) = 0.00869$] and far from $f$ [$\delta(g_{s2}, f) = 0.511$] but also is not so close to $g_{ac}$ [$\delta(g_{s2}, g_{ac}) = 0.0978$].

A complete understanding of phase-retrieval stagnation points and their relationship to ambiguous images is not yet available. However, from the limited number of experiments of the type described above, we can say that stagnation points are often related to ambiguous images.
6.  NEAREST  AMBIGUITIES

In this section we investigate the space of ambiguous images in order to gain some insight into just how close the nearest ambiguous image is to a typical image. This may in turn have implications about how nearest ambiguities relate to stagnation points encountered in iterative phase retrieval. It also will tell us the probability of an ambiguity in the practical sense, as a function of the noise in the Fourier-modulus data.

Object-Domain  Conditions  for  Ambiguity 

As described above, ambiguous images are characterized in the Fourier domain by factorable Fourier transforms and in the object domain by being expressible as the convolution of two or more smaller images. We choose the object-domain relationship to characterize the space of ambiguous images. We begin by deriving the ambiguity condition for the smallest possible 2-D ambiguous image ($2 \times 2$ support) and then similarly derive it for a $3 \times 2$ support.

2 x 2 Ambiguity Conditions

Consider the case of a real-valued image on a $2 \times 2$ support. It is ambiguous if it can be expressed as the convolution of two 1-D sequences:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} e \\ f \end{bmatrix} * \begin{bmatrix} g & h \end{bmatrix}, \tag{29}$$

where $e$, $f$, $g$, and $h$ are all nonzero (for simplicity only the nonzero rows and columns of the arrays are shown). This gives the following equations for $a$, $b$, $c$, and $d$:

$$a = eg, \tag{30a}$$

$$b = eh, \tag{30b}$$

$$c = fg, \tag{30c}$$

$$d = fh. \tag{30d}$$

Multiplying Eq. (30a) with Eq. (30d) and Eq. (30b) with Eq. (30c), we arrive at the following $2 \times 2$ convolution condition:

$$ad = bc. \tag{31}$$

In this case a single ambiguous counterpart to an image satisfying Eq. (31) is generated by convolving one of the 1-D sequences by the flip (rotation by 180°) of the other (equivalent to conjugating the corresponding Fourier factor). However, if $e = f$ and/or $g = h$ (i.e., one of the 1-D sequences is symmetric), then flipping the factor has no effect, and the image is still unique. Furthermore, if $e = -f$ and/or $g = -h$, then a flip of either convolution factor becomes the negative of the original factor. Since we do not consider two images that differ by a scalar multiple ($-1$ in this case) as ambiguous counterparts, we must also rule out this special case of negative symmetric factors. Therefore the image is unique if $|e| = |f|$ or if $|g| = |h|$. From Eqs. (30) we see that, if $|a| = |c|$ or $|b| = |d|$, then $|e| = |f|$, and if $|a| = |b|$ or $|c| = |d|$, then $|g| = |h|$. When these special cases are combined with Eq. (31), the ambiguity condition for the case of $2 \times 2$ support becomes

$$ad = bc, \tag{32a}$$

$$|b| \neq |a| \neq |c|. \tag{32b}$$

Note that the inequalities of relation (32b) combined with Eq. (32a) imply that $|b| \neq |d| \neq |c|$.

Equation (32a) describes a three-dimensional surface in the four-dimensional space of real-valued $2 \times 2$ images. While it is accepted that there is zero probability that an arbitrarily selected object will land on this surface, i.e., the phase-retrieval problem is almost always (with probability 1) unique, in this paper we are concerned with how close the Fourier modulus of a given object is likely to be to the Fourier moduli of images lying upon this surface.
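The $2 \times 2$ condition lends itself to a direct numerical check. The sketch below is illustrative only: it assumes the condition reads $ad = bc$ together with $|b| \neq |a| \neq |c|$ and all four pixels nonzero, and the function name `is_ambiguous_2x2`, the tolerance, and the sample factors are hypothetical.

```python
import numpy as np

def is_ambiguous_2x2(img, tol=1e-9):
    """Test the assumed 2 x 2 ambiguity condition on the image [[a, b], [c, d]]."""
    (a, b), (c, d) = np.asarray(img, dtype=float)
    if min(abs(a), abs(b), abs(c), abs(d)) <= tol:
        return False                                   # all factor elements must be nonzero
    on_surface = abs(a * d - b * c) <= tol             # ad = bc
    not_symmetric = abs(abs(b) - abs(a)) > tol and abs(abs(a) - abs(c)) > tol
    return bool(on_surface and not_symmetric)

# Hypothetical factors: column sequence (e, f) and row sequence (g, h).
col = np.array([1.0, 2.0])
row = np.array([3.0, 5.0])
image = np.outer(col, row)              # a = eg, b = eh, c = fg, d = fh
counterpart = np.outer(col, row[::-1])  # flip one factor

mod_image = np.abs(np.fft.fft2(image, (4, 4)))
mod_counterpart = np.abs(np.fft.fft2(counterpart, (4, 4)))
print(is_ambiguous_2x2(image), np.allclose(mod_image, mod_counterpart))
```

With a symmetric factor such as `np.array([1.0, 1.0])` in place of `col`, the product still satisfies $ad = bc$ but the check returns `False`, matching the special cases excluded in the text.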

3 x 2 Ambiguity Conditions

The same approach is used to formulate object-domain ambiguity conditions for $3 \times 2$ images. A $3 \times 2$ image results from convolving either (a) a $3 \times 1$ sequence with a $1 \times 2$ sequence or (b) a $2 \times 1$ sequence with a $2 \times 2$ image. Since it is known that any 1-D sequence can always be written as the convolution of smaller sequences, we can write the $3 \times 1$ sequence of case (a) as the convolution of two $2 \times 1$ sequences. We can then combine one of these factors with the $1 \times 2$ factor to give case (b). Thus we need only consider case (b), and our $3 \times 2$ image is ambiguous if

$$\begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} = \begin{bmatrix} g & h \end{bmatrix} * \begin{bmatrix} i & j \\ k & l \end{bmatrix} = \begin{bmatrix} gi & gj + hi & hj \\ gk & gl + hk & hl \end{bmatrix}, \tag{33}$$

where $g$ and $h$ are nonzero and none of the pairs ($i$ and $j$) or ($i$ and $k$) or ($j$ and $l$) or ($k$ and $l$) is zero. This gives six nonlinear equations for $a$, $b$, $c$, $d$, $e$, and $f$ in terms of $g$, $h$, $i$, $j$, $k$, and $l$. As is shown in Appendix B, these equations can be solved to give the following ambiguity condition:

$$(af - cd)^2 - (ae - bd)(bf - ce) = 0. \tag{34}$$

Equation (34) describes a five-dimensional surface in the 6-D space of real-valued $3 \times 2$ images. In comparison, for the $2 \times 2$ case the ambiguity surface describes a three-dimensional surface embedded within a four-dimensional space. Appendix B also shows that Eq. (34) can be solved to give, for example, $b$ in terms of the remaining five values:

[Eq. (35) is not legible in the source.]

An ambiguous, real-valued $3 \times 2$ image arising from the convolution of a $2 \times 1$ sequence with a nonfactorable $2 \times 2$ image can be shown to have an ambiguous counterpart that must also be real valued. However, if the $2 \times 2$ convolution factor of Eq. (33) can itself be factored, then we have the case of a $3 \times 2$ image resulting from the convolution of a $3 \times 1$ sequence with a $1 \times 2$ sequence. An ambiguous, real-valued image formed in this way will have rows that are scalar multiples of one another; i.e., $a = Kd$, $b = Ke$, and $c = Kf$ for some scalar $K$. This condition makes each difference term in Eq. (34) equal to zero. It is straightforward to show that if $b^2 < 4ac$, then this real-valued ambiguity will have a complex-valued ambiguous counterpart. If the image is constrained to be real valued, then this complex-valued image does not constitute an ambiguity within the space of real-valued images. Furthermore, because this special case is a small subset of the entire ambiguity surface, we expect it to have a relatively minor effect on the likelihood of stagnation due to nearby ambiguities.
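The $3 \times 2$ condition can be exercised numerically. The sketch below is an illustration under assumptions, not material from Appendix B: it takes the ambiguity condition to be $(af - cd)^2 - (ae - bd)(bf - ce) = 0$ for the image with rows $(a, b, c)$ and $(d, e, f)$, and it solves that expression as a quadratic in $b$ (valid when $df \neq 0$), which is our own rearrangement and may differ in form from the paper's Eq. (35). The function names and the factors `seq` and `core` are hypothetical; `g_ac_paper` holds the values of $g_{ac}$ as read from Eq. (26).

```python
import numpy as np

def ambiguity_residual_3x2(img):
    """Left-hand side of the assumed 3 x 2 ambiguity condition."""
    (a, b, c), (d, e, f) = np.asarray(img, dtype=float)
    return (a * f - c * d) ** 2 - (a * e - b * d) * (b * f - c * e)

def solve_b(a, c, d, e, f):
    """Real roots b of d*f*b**2 - e*(a*f + c*d)*b + a*c*e**2 + (a*f - c*d)**2 = 0."""
    disc = e**2 - 4.0 * d * f
    if disc < 0.0:
        return ()                         # no real-valued b puts the image on the surface
    mid = e * (a * f + c * d)
    half = (a * f - c * d) * np.sqrt(disc)
    return ((mid + half) / (2.0 * d * f), (mid - half) / (2.0 * d * f))

# Hypothetical factors: a length-2 sequence and a 2 x 2 image, convolved row by row.
seq = np.array([1.0, 2.5])
core = np.array([[0.4, 1.3], [0.7, 0.8]])
image = np.vstack([np.convolve(seq, r) for r in core])
counterpart = np.vstack([np.convolve(seq[::-1], r) for r in core])

mod_image = np.abs(np.fft.fft2(image, (4, 6)))
mod_counterpart = np.abs(np.fft.fft2(counterpart, (4, 6)))

g_ac_paper = np.array([[0.350, 2.146, 3.171], [0.677, 2.475, 1.974]])
print(ambiguity_residual_3x2(image), ambiguity_residual_3x2(g_ac_paper))
print(solve_b(image[0, 0], image[0, 2], *image[1]))
```

The residual vanishes for the constructed image and is small (limited by the three-decimal rounding of the printed values) for `g_ac_paper`, while a generic image gives a residual of order one.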
Fig. 7. Flow chart for determining the ambiguity of the $3 \times 2$ real-valued image of Eq. (33). Multiple conditions in a box must all be satisfied for "YES," except where "or" is specified.

The ability to factor an image into the convolution of two or more images is necessary, but not sufficient, for determining ambiguity. When we discussed the ambiguity condition for $2 \times 2$ images, we considered the special cases of what we called symmetric and negative-symmetric convolution factors. These special cases, as well as the effect of zero-valued pixels, also must be considered for $3 \times 2$ images. To save space, rather than discussing these exceptions in detail we summarize them in the ambiguity flow chart in Fig. 7.

Nearest  Ambiguity  by  Means  of  Constrained 
Minimization 

The mathematical description of ambiguities for 2 × 2 and 3 × 2 images can be used to investigate the nearness of a given object to an ambiguous image. We formulate the task of finding the ambiguous image nearest a given object as a multidimensional constrained-minimization problem. By nearest ambiguity we mean the image on the ambiguity surface for which some objective function involving the ambiguous image and the given object is minimized. For the objective function we choose

$$E(g, f) = \sum_{u,v} \left[\,|G(u, v)| - |F(u, v)|\,\right]^2, \tag{36}$$

which is just $\epsilon^2(g, f)$ of Eq. (9) with $\alpha_f = 1$ and without the normalization.

Each $M \times N$ image, having $L = M \times N$ pixel values, can be thought of as a single point in an $L$-dimensional vector space. To emphasize this fact we can denote an image $g$ by the $L$-dimensional vector $\mathbf{x}$, where $\mathbf{x} = (a\ b\ c\ d)^T$ for the 2 × 2 case and $\mathbf{x} = (a\ b\ c\ d\ e\ f)^T$ for the 3 × 2 case (the ordering of the pixels in the vector $\mathbf{x}$ is arbitrary). Therefore, for a given image $f$, we desire to find $\mathbf{x}$ (or $g$) on the ambiguity surface that minimizes $E(\mathbf{x}) \equiv E(g, f)$. (Note that if we did not constrain $\mathbf{x}$ to be on the ambiguity surface, then we would just be solving the phase-retrieval problem!) If we define the ambiguity surface by $h(\mathbf{x}) = 0$, then the problem of finding the nearest ambiguity to $f$ can be stated as follows:

Given an object $f$, find the $\mathbf{x}$ that minimizes the objective function $E(\mathbf{x})$ subject to the ambiguity condition $h(\mathbf{x}) = 0$.

The two image supports for which we have derived ambiguity conditions [Eqs. (32) and (34)] give rise to the following $h(\mathbf{x})$:

2 × 2 Images ($L = 4$)

$$h(\mathbf{x}) = ad - bc = 0, \tag{37}$$

3 × 2 Images ($L = 6$)

$$h(\mathbf{x}) = (af - cd)^2 - (ae - bd)(bf - ce) = 0. \tag{38}$$
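
The two ambiguity conditions can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the pixel layout implied by Appendix B, in which the 3 × 2 image has rows (a, b, c) and (d, e, f) and is the convolution of a two-element horizontal sequence with a 2 × 2 block. The function names, the random test values, and the 4 × 6 zero-padded DFT grid are illustrative choices.

```python
import numpy as np

def convolve_rows(seq, block):
    """Convolve a 2-element horizontal sequence with each row of a 2x2 block.
    The result has 2 rows and 3 columns: [[a, b, c], [d, e, f]]."""
    return np.array([np.convolve(seq, row) for row in block])

def h_2x2(img):
    """Left-hand side of Eq. (37) for the image [[a, b], [c, d]]."""
    (a, b), (c, d) = img
    return a * d - b * c

def h_3x2(img):
    """Left-hand side of Eq. (38) for the image [[a, b, c], [d, e, f]]."""
    (a, b, c), (d, e, f) = img
    return (a * f - c * d) ** 2 - (a * e - b * d) * (b * f - c * e)

def fourier_modulus(img, shape=(4, 6)):
    """Modulus of the zero-padded 2-D DFT (padding size is an assumption)."""
    return np.abs(np.fft.fft2(img, s=shape))

rng = np.random.default_rng(0)
seq = rng.uniform(-2, 2, size=2)          # hypothetical 2x1 factor
block = rng.uniform(-2, 2, size=(2, 2))   # hypothetical 2x2 factor

g_a = convolve_rows(seq, block)           # an image on the ambiguity surface
g_ac = convolve_rows(seq[::-1], block)    # counterpart: one factor flipped

print("h(g_a)  =", h_3x2(g_a))
print("h(g_ac) =", h_3x2(g_ac))
print("moduli agree:", np.allclose(fourier_modulus(g_a), fourier_modulus(g_ac)))
```

Flipping the two-element factor leaves its Fourier modulus unchanged, so both images share one Fourier modulus while both satisfy $h(\mathbf{x}) = 0$ to rounding error.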


Iterative Constrained Minimization
Using the mathematical framework developed above, we now implement a generalized reduced-gradient (sometimes referred to as a gradient-projection) method to find the nearest ambiguity to a given image. This method is explained in detail in Appendix C and is summarized below.

In an unconstrained gradient-search method, we search for a minimum to the objective function $E(\mathbf{x})$ in the direction of $-\nabla E(\mathbf{x})$, the negative gradient of that function. In a constrained search we still would like to follow the negative gradient, but we are constrained to move along a particular surface within the space, described by the equation $h(\mathbf{x}) = 0$. We alter the search direction by projecting $-\nabla E(\mathbf{x})$ onto a tangent plane of $h(\mathbf{x})$, and we then move along the plane in the direction of the projection, $\mathbf{p}$, as depicted in Fig. 8. Then, from a point along $\mathbf{p}$, which is generally not on the constraint surface, we find a nearby (not necessarily the closest) point on the constraint surface. The method used here to return to the constraint surface is detailed in Appendix C. The search for the solution is iterative, and we define our estimate of the solution after the $k$th iteration as $\mathbf{x}_k$. At the solution, $\mathbf{x}_s$, $-\nabla E(\mathbf{x})$ is perpendicular to the tangent plane to the constraint surface, and the projection onto the tangent plane is zero.
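
The projected-gradient idea summarized above can be sketched for the 2 × 2 surface of Eq. (37). This is an illustration under stated assumptions, not the Appendix C algorithm: the gradient of the objective is taken by central differences, the return to the constraint surface uses repeated steps along the constraint gradient, and the step size is found by simple backtracking. All function names, the 4 × 4 DFT grid, and the test object are hypothetical.

```python
import numpy as np

def fourier_modulus(x):
    """|DFT| of a 2x2 image stored as x = (a, b, c, d), zero padded to 4x4."""
    return np.abs(np.fft.fft2(x.reshape(2, 2), s=(4, 4)))

def objective(x, f_mod):
    """Unnormalized squared Fourier-modulus error, in the form of Eq. (36)."""
    return np.sum((fourier_modulus(x) - f_mod) ** 2)

def h(x):
    a, b, c, d = x
    return a * d - b * c                      # Eq. (37)

def grad_h(x):
    a, b, c, d = x
    return np.array([d, -c, -b, a])

def grad_numeric(fun, x, eps=1e-6):
    """Central-difference gradient (an illustrative shortcut)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (fun(x + step) - fun(x - step)) / (2 * eps)
    return g

def restore(x, iters=20):
    """Return to h(x) = 0 by repeated steps along grad h (assumed scheme)."""
    for _ in range(iters):
        n = grad_h(x)
        x = x - h(x) * n / (n @ n)
    return x

def nearest_ambiguity(f, x0, step=0.1, iters=300, tol=1e-8):
    f_mod = fourier_modulus(f)
    fun = lambda y: objective(y, f_mod)
    x = restore(np.asarray(x0, dtype=float))
    for _ in range(iters):
        g = grad_numeric(fun, x)
        n = grad_h(x)
        p = -(g - (g @ n) / (n @ n) * n)      # -grad E projected onto tangent plane
        if np.linalg.norm(p) < tol:
            break                             # projection vanishes at a solution
        t, e0 = step, fun(x)
        while t > 1e-12:                      # backtrack until the error decreases
            x_new = restore(x + t * p)
            if fun(x_new) < e0:
                x = x_new
                break
            t *= 0.5
        else:
            break                             # no improving step was found
    return x

f = np.array([1.0, 2.0, 3.0, 4.5])            # hypothetical 2x2 object
x_near = nearest_ambiguity(f, x0=f)
print("x =", x_near, "h(x) =", h(x_near))
```

A single run such as this finds only a local minimum on the surface; repeated runs from different starting points are still needed before a minimum is treated as global.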

It is difficult to determine whether the minimum found is indeed the global minimum or just a local minimum. In a numerical simulation such as this, one can gain confidence in claiming a minimum as global only through repeated searching with different initial guesses. Our practical criteria for claiming that a minimum $\mathbf{x}_s$ is global are that $E(\mathbf{x}_s)$ is the smallest among all minima found and that it is found more than twice as many times as the total number of minima found, which must be more than four. If the above criteria are not satisfied after 40 different minima are found, then the one that minimizes $E(\mathbf{x})$ is chosen (and we simply realize that it may not be the global minimum). It should be noted that at points on the surface where $\nabla h(\mathbf{x}) = 0$ the tangent plane is not defined. If such a singular point is encountered the search may terminate without satisfying a convergence criterion, but the estimate at the singular point may still minimize the objective function over all other estimates (see Appendix C).

Fig. 8. Gradient-projection constrained-minimization algorithm. The search direction is determined by projecting the negative gradient of the objective function onto the tangent plane to the constraint surface.

Although the constrained-minimization algorithm minimizes an objective function defined in Fourier-modulus space, the search itself takes place on surfaces in object space. The minima found on the surface of Eq. (37) will always correspond to images with two convolution factors, and that usually will be the case for the minima found on the surface of Eq. (38) as well. Thus the nearest ambiguity in Fourier-modulus space to an object $f$ corresponds in object space to any of four images (not counting scalar multiples of these images): the ambiguity, its ambiguous counterpart, and the twin image of each. So, once we have an estimate of the global minimum with respect to Fourier-domain error [Eq. (36)], denoted by $g_1$, we calculate the object-domain error $\delta$ for $g_1$ and its twin image, retaining the smaller of the two values. We then find the ambiguous counterpart to $g_1$, denoted by $g_2$, by convolving one of the factors of $g_1$ with the twin of the other. After finding the smaller $\delta$ for $g_2$ and its twin, we keep as the worst-case nearest ambiguity the larger of this $\delta$ and the one retained for $g_1$ and its twin. Referring back to Fig. 1, the smaller value of $\delta$ corresponds to the nearest ambiguity in the object domain, $g_a$, and the larger retained value of $\delta$ corresponds to its ambiguous counterpart, $g_{ac}$, the worst-case nearest ambiguity. Although $g_a$ and $g_{ac}$ are both nearest ambiguities to $f$ with respect to Fourier-domain error, we differentiate them by defining the worst-case nearest ambiguity as the one with the larger value of the object-domain error, $\delta$, with respect to $f$. The worst-case nearest ambiguity corresponds to the point in object space farthest from the true image that either is likely to cause local minima to trap phase-retrieval algorithms or could be confused with the true image if the squared error in the data exceeds $E(\mathbf{x})$.
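
The selection rule described above (keep the smaller error over each image and its twin, then label the pair by which retained error is larger) can be sketched as follows. This is only an illustration. The error function is an assumed stand-in for the object-domain error of Eq. (11), which is not reproduced in this section: it is a plain normalized rms difference with no scale or shift search. The factor values and all names are hypothetical.

```python
import numpy as np

def twin(g):
    """Twin image g(-x, -y) of a real-valued image: a 180-degree rotation."""
    return g[::-1, ::-1]

def ode(g, f):
    """Assumed stand-in for the object-domain error: normalized rms difference."""
    return np.sqrt(np.sum((g - f) ** 2) / np.sum(f ** 2))

def classify_pair(g1, g2, f):
    """Label an ambiguous pair as nearest (g_a) and worst case (g_ac)."""
    d1 = min(ode(g1, f), ode(twin(g1), f))   # smaller error for g1 and its twin
    d2 = min(ode(g2, f), ode(twin(g2), f))   # smaller error for g2 and its twin
    if d1 <= d2:
        return {"g_a": g1, "g_ac": g2, "delta_a": d1, "delta_ac": d2}
    return {"g_a": g2, "g_ac": g1, "delta_a": d2, "delta_ac": d1}

# Hypothetical factors: a 2-element sequence and a 2x2 block.
seq = np.array([1.0, 0.5])
block = np.array([[1.0, 2.0], [3.0, -1.0]])
g1 = np.array([np.convolve(seq, row) for row in block])
g2 = np.array([np.convolve(seq[::-1], row) for row in block])  # one factor flipped

f = g1 + 0.01                                # an object close to g1
result = classify_pair(g1, g2, f)
print(result["delta_a"], result["delta_ac"])
```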

Monte  Carlo  Simulations 

To investigate the prevalence of ambiguities we implemented the constrained-minimization nearest-ambiguity search in a Monte Carlo simulation in which nearest ambiguous images were found for a large number of random objects $f(x, y)$. Each pixel of the object was an independent, real-valued random number uniformly distributed on the interval [-2, 2], or [0, 4] for nonnegative objects. The results of the Monte Carlo simulations are presented in the form of scatter plots of $\epsilon$ versus $\delta$ for the worst-case nearest ambiguity. For each random object $f$, the value of $\epsilon$ for the nearest ambiguity is plotted versus the worst-case $\delta$. The interpretation of these scatter plots should not be confused with that of the grid-search scatter plots shown above. Recall that all the $(\delta, \epsilon)$ pairs in a grid-search scatter plot are calculated by using a single object $f$ and have nothing to do with ambiguities, while each $(\delta, \epsilon)$ point in a Monte Carlo scatter plot represents metrics for the worst-case nearest ambiguity to a different random object $f$. We computed these plots for five separate cases: (1) 2 × 2 objects without a nonnegativity constraint on $f$, (2) 2 × 2 objects with a nonnegativity constraint, (3) 3 × 2 objects without a nonnegativity constraint, (4) 3 × 2 objects with a nonnegativity constraint, and (5) L-shaped (with $b = c = 0$) 3 × 2 objects with a nonnegativity constraint. The five cases above represent different constraints on $f$. The only constraint on the worst-case nearest ambiguity is that it lie upon the ambiguity surface corresponding to the support of $f$.

A typical scatter plot of ~4000 points required ~110 h for the 2 × 2 objects and ~1500 h for the 3 × 2 objects on an IBM AT personal computer.

The scatter plots of $\epsilon$ versus $\delta$ for the 2 × 2 support cases (1) and (2) are shown in Fig. 9. The points that would cause trouble are those that have small Fourier-modulus error (FME), $\epsilon$, and significantly larger object-domain error (ODE), $\delta$. These troublesome points are likely to induce phase-retrieval algorithm stagnation and/or are ambiguous from a practical point of view when the Fourier-modulus data are sufficiently noisy. One definition of a trouble region is all the points below the line $\delta = K\epsilon$, shown in Fig. 9 for $K = 4$ and $K = 10$. That is, we do not consider the practical ambiguity problem to be significant unless the error, $\delta$, in the ambiguous reconstruction or stagnation point exceeds 4 times (or 10 times) the error in the Fourier-modulus data. Only then would we consider the ambiguity to be significant. (Although it was easy to show in Appendix A that $\delta \ge \epsilon$ for any pair of images, an analogous relationship for an image and its worst-case nearest ambiguity has not been developed.) Figure 9(a) (no nonnegativity constraint on $f$) exhibits a banded structure with a higher density of points above the $\delta = 4\epsilon$ line, which effectively reduces the probability of nearest ambiguities in the trouble region. Figure 9(b) (nonnegativity constraint on $f$) reveals a higher density of points in the trouble region, particularly for $\delta < 0.5$. Thus the nonnegativity constraint on $f$ actually increases the probability that a random object's Fourier modulus is close to that of an ambiguous image for the 2 × 2 case.

One way to estimate the probability of significant ambiguity is to integrate these scatter plots in the trouble region below the line $\delta = K\epsilon$. If we bin the points below this line with respect to $\epsilon$, we can obtain an estimate of the probability-density function of the probability that the worst-case nearest ambiguity has FME $\epsilon$ and ODE $\delta > K\epsilon$. Integrating this estimated probability-density function from 0 to $\epsilon$ yields an estimate of the probability that the worst-case nearest ambiguity to an arbitrary object has less than $\epsilon$ FME and ODE $\delta > K\epsilon$. These cumulative probability distributions define what we mean by the probability of significant ambiguity. These distributions for cases (1) and (2) are shown in Fig. 10 for $K = 4$ and $K = 10$. Figure 10 verifies our previous observation that the nonnegativity constraint actually increases the chance of significant ambiguity. For example, these estimated distributions tell us that, given an arbitrary, real-valued 2 × 2 object, the probability of finding a worst-case nearest ambiguity with FME $\epsilon < 0.04$ and ODE $\delta > 0.16$ is 10% for $f$ without nonnegativity and 18% for $f$ with nonnegativity.

Fig. 9. Fourier-modulus error $\epsilon$ versus object-domain error $\delta$ for worst-case nearest ambiguities to 2 × 2 objects. (a) No nonnegativity constraint, 4752 objects; (b) nonnegativity constraint, 4486 objects.

Fig. 10. Monte Carlo estimates of the probability that the worst-case nearest ambiguity to 2 × 2 objects with and without a nonnegativity constraint has a Fourier-modulus error less than $\epsilon$ and an object-domain error greater than $K\epsilon$ ($K = 4$ and $K = 10$).

The same analysis for the 3 × 2 object support [cases (3) and (4)] reveals the opposite trend. Figure 11 shows the $\epsilon$ versus $\delta$ scatter plots for the nearest 3 × 2 ambiguities with and without a nonnegativity constraint on $f$. With no nonnegativity constraint, the scatter plot of Fig. 11(a) is uniform in appearance, indicating a greater likelihood of nearby ambiguities in the trouble regions. With the nonnegativity constraint, Fig. 11(b) shows a high concentration of points in the large-$\epsilon$, large-$\delta$ region of the plot, away from the trouble region. It is the nonnegativity constraint that creates the favorable banding effect for the 3 × 2 case. Integrating these plots below the $K = 4$ and $K = 10$ lines yields the probability distributions of Fig. 12. In comparison with the example given for the 2 × 2 nonnegative case, the probability of finding a worst-case nearest ambiguity with FME $\epsilon < 0.04$ and ODE $\delta > 0.16$ is increased to 17% without nonnegativity but reduced by approximately one half to 9% with the nonnegativity constraint on $f$.

Fig. 11. Fourier-modulus error $\epsilon$ versus object-domain error $\delta$ for worst-case nearest ambiguities to 3 × 2 objects. (a) No nonnegativity constraint, 4112 objects; (b) nonnegativity constraint, 4601 objects.

Fig. 12. Monte Carlo estimates of the probability that the worst-case nearest ambiguity to 3 × 2 objects with and without a nonnegativity constraint has a Fourier-modulus error less than $\epsilon$ and an object-domain error greater than $K\epsilon$ ($K = 4$ and $K = 10$).

One possible reason that nonnegativity reduces the probability of significant ambiguity for the 3 × 2 case is as follows. From Eq. (35) we see that there are no real-valued ambiguous images for which $e^2 - 4df < 0$. Since $-4df$ is negative for positive $d$ and $f$, but is positive if one of them is negative, $e^2 - 4df$ is more often negative for nonnegative images. Thus nonnegative objects are less likely to have nearest ambiguities that are nearby (in the object domain) than are objects without a nonnegativity constraint. Since objects that are similar in the object domain will tend to be similar in the Fourier-modulus domain, the nearest ambiguities to nonnegative objects are less likely to be nearby with respect to Fourier modulus as well.
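
The cumulative estimate described earlier in this section (the fraction of Monte Carlo points with Fourier-modulus error below a tolerance that also lie in the trouble region $\delta > K\epsilon$) reduces to a simple count. The sketch below uses made-up sample points purely for illustration; the function name and the data are hypothetical and are not taken from the simulations reported here.

```python
import numpy as np

def prob_significant(eps, delta, eps_max, K):
    """Fraction of points with eps < eps_max that lie in the region delta > K * eps."""
    eps = np.asarray(eps, dtype=float)
    delta = np.asarray(delta, dtype=float)
    in_region = (eps < eps_max) & (delta > K * eps)
    return np.count_nonzero(in_region) / eps.size

# Made-up (eps, delta) pairs, one per random object.
eps = [0.01, 0.03, 0.03, 0.10, 0.02]
delta = [0.20, 0.05, 0.25, 0.90, 0.50]

p4 = prob_significant(eps, delta, eps_max=0.04, K=4)
p10 = prob_significant(eps, delta, eps_max=0.04, K=10)
print(p4, p10)
```

Evaluating the function over a range of tolerances gives a curve of the kind plotted in Figs. 10 and 12; a larger $K$ can only shrink the trouble region, so the $K = 10$ curve never lies above the $K = 4$ curve.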

An  important  point  that  should  be  stressed  is  that  the 
nonnegativity  constraint  discussed  in  this  section  is  on  the 
object  /  and  not  on  the  nearest  ambiguity.  Because  of  this 
fact,  the  nearest  ambiguous  image  to  a  nonnegative  object 
might  not  be  nonnegative  itself;  it  could  contain  one  or  two 
negative-valued  pixels.  Thus  a  nonnegativity  constraint  in 
a  phase-retrieval  algorithm  may  help  to  move  the  image 
away  from  a  stagnation  point  near  the  ambiguity,  and  the 
probability  of  ambiguity  in  the  practical  sense  would  be 
reduced  compared  with  the  results  shown  here. 

At this point it is useful to recall the conjecture made in Section 5, i.e., that stagnation points of the iterative Fourier-transform algorithm tend to be near ambiguous images that have Fourier moduli close to the given Fourier modulus, $|F|$. The example given in Section 5 used an object $f$ and its nearest ambiguity [Eqs. (24)-(26)] taken from the Monte Carlo experiment with 3 × 2 nonnegative objects. Recall that, for the object $f$ of Eq. (24), after numerous trials we found two stagnation points of both the HIO and ER versions of the iterative Fourier-transform algorithm. The closeness in both domains of these stagnation points to the worst-case nearest ambiguity, $g_{ac}$ [Eq. (26)], was shown. A few more simulations of this type were performed for different nonnegative 3 × 2 objects. Objects were selected based on the locations in Fig. 11(b) of their worst-case nearest-ambiguity error metrics. All objects selected had a worst-case nearest ambiguity with $0.45 < \delta < 0.55$. Three objects with (significant) worst-case nearest ambiguities with $\epsilon < 0.05$ [as was the case for $f$ of Eq. (24)] were selected, and, compared with the 26% success rate for $f$ with HIO, the true solution was found 48%, 49%, and 50% of the time, respectively, by using HIO on these three objects. As with $f$, when the true solution was not found, the algorithm stagnated near the worst-case nearest ambiguity to each of the three objects. Two objects with a worst-case nearest ambiguity with $\epsilon \approx 0.10$ converged to the true solution ~[illegible]% and 100% of the time, and another object with a worst-case nearest ambiguity with $\epsilon = 0.30$ converged to the solution 100% of the time. Thus stagnation tends to decrease as the nearest ambiguities move farther away with respect to $\epsilon$ (equivalently, as the significance of ambiguity decreases). As mentioned above, the limited number of experiments of this type has not yet provided us with a complete understanding of phase-retrieval stagnation points and their relationship to worst-case nearest ambiguous images. Nevertheless, the correlation of the object's worst-case nearest ambiguity having large $\delta$ and small $\epsilon$ ($\epsilon < 0.05$ for our experiments) with the presence of stagnation points has been convincingly established.

The final case investigated is nonnegative, 3 × 2 objects with $b = c = 0$, which we call L-shaped objects. The L-shaped support itself mandates uniqueness; i.e., it is not possible to convolve two nontrivial functions to obtain an image with this support. After running the Monte Carlo simulation for these objects, we discovered a class of L-shaped ambiguities that gives rise to misleading results. Consider the object

with $\delta(g_a, f) = 7.015\mathrm{E}{-4}$ and $\epsilon(g_a, f) = 4.167\mathrm{E}{-4}$. The ambiguous counterpart to $g_a$, obtained by flipping the first convolution factor in Eq. (40), is


g„  =  (0.045354 
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2.01419 
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3.88340 
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0.08769  2.18326  3.88340, 


(41) 


The object-domain error between $f$ and $g_{ac}$ as defined by Eq. (11) is $\delta(g_{ac}, f) = 1.0629$. However, comparison of $f$ and $g_{ac}$ reveals that the image $g_{ac}$ is similar to the image $f$ shifted by one pixel to the right. This is because the first convolution factor of Eq. (41) is nearly a delta function, and the second factor is very similar to the image $f$ without its right-hand column. The first convolution factor causes a tapering of the image, making one column much smaller in value than the other nonzero pixels. Flipping one of the convolution factors simply shifts the significant pixels and moves the tapered column to the other side of the image. Because the object-domain error metric $\delta$ does not take such shifts into account, the value of $\delta(g_{ac}, f)$ calculated for this case is much too large, resulting in a misleading point on the scatter plot. (If the calculations were to be redone, then this problem could be accounted for by cross correlating $g_{ac}$ with $f$ and shifting $g_{ac}$ according to the cross-correlation peak to minimize $\delta$.)
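
The shift correction suggested in the parenthetical remark can be sketched as follows. This is a hypothetical illustration, not a computation from the original study: the images are made-up L-shaped examples, the error function is an assumed normalized rms difference standing in for Eq. (11), and the alignment is done with an FFT-based circular cross correlation on a zero-padded grid.

```python
import numpy as np

def ode(g, f):
    """Assumed stand-in for the object-domain error: normalized rms difference."""
    return np.sqrt(np.sum((g - f) ** 2) / np.sum(f ** 2))

def align_by_cross_correlation(g, f):
    """Shift g to the peak of its cross correlation with f.
    Both images are returned on a zero-padded grid of twice the size."""
    rows, cols = f.shape
    shape = (2 * rows, 2 * cols)               # padding keeps wrapped pixels apart
    F = np.fft.fft2(f, s=shape)
    G = np.fft.fft2(g, s=shape)
    xcorr = np.fft.ifft2(F * np.conj(G)).real  # xcorr[s] = sum_n f[n + s] g[n]
    dy, dx = np.unravel_index(np.argmax(xcorr), shape)
    g_pad = np.zeros(shape)
    g_pad[:rows, :cols] = g
    f_pad = np.zeros(shape)
    f_pad[:rows, :cols] = f
    g_shift = np.roll(g_pad, (dy, dx), axis=(0, 1))  # circular shift to the peak
    return g_shift, f_pad

# Made-up example: g resembles f moved one pixel to the right,
# with the small (tapered) column on the opposite side.
f = np.array([[2.0, 0.0, 0.0],
              [2.1, 3.9, 0.1]])
g = np.array([[0.0, 2.0, 0.0],
              [0.1, 2.1, 3.9]])

g_shift, f_pad = align_by_cross_correlation(g, f)
print("error before shift:", ode(g, f))
print("error after shift: ", ode(g_shift, f_pad))
```

With the shift applied, the error for this example drops from above 1 to a few percent, which is the kind of horizontal movement on the scatter plot that the text anticipates.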

A similar problem may occur if the shorter leg of the L-shaped support is tapered, leading to nearest ambiguities that are close to 1-D sequences. To reduce the misleading effects of tapered images on our analysis, we consider only those images that satisfy a bound on the robustness of the L shape. An L-shaped image $\begin{pmatrix} a & 0 & 0 \\ d & e & f \end{pmatrix}$ has L robustness $R\%$, defined by

$$R = \min[a, f] \big/ \left[(a^2 + d^2 + e^2 + f^2)/4\right]^{1/2}. \tag{42}$$

Images with large $R$ are robustly L shaped, whereas images with small $R$ (strongly tapered) are only weakly L shaped.
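
As a quick numerical illustration of Eq. (42), the sketch below evaluates the L robustness for two made-up images. It assumes the pixel layout with rows (a, 0, 0) and (d, e, f) and reports $R$ as a percentage; the function name and the sample values are hypothetical.

```python
import numpy as np

def l_robustness(img):
    """L robustness of Eq. (42), in percent, for an image [[a, 0, 0], [d, e, f]]."""
    (a, _, _), (d, e, f) = img
    rms = np.sqrt((a ** 2 + d ** 2 + e ** 2 + f ** 2) / 4.0)
    return 100.0 * min(a, f) / rms

robust = np.array([[2.0, 0.0, 0.0],
                   [2.0, 2.0, 2.0]])     # all four nonzero pixels equal
tapered = np.array([[2.0, 0.0, 0.0],
                    [2.0, 2.0, 0.1]])    # end of the long leg nearly zero

print(l_robustness(robust))    # equal pixels give 100 percent
print(l_robustness(tapered))   # tapered image falls below the 10 percent bound
```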

It should be noted that the same taper problem can also cause misleading ODE calculations of worst-case nearest ambiguities for the 2 × 2 and 3 × 2 images in cases (1)-(4). In these cases, whole rows or columns would have to be significantly smaller than the rms pixel intensity of the image. Since the images are random, it is much less likely for this to occur in cases (1)-(4), for which two or more pixels must be small simultaneously, than for the L-shaped case (5), for which only a single pixel must be small.

Fig. 13. Fourier-modulus error $\epsilon$ versus object-domain error $\delta$ for worst-case nearest ambiguities to 3 × 2, nonnegative, L-shaped objects. (a) L robustness > 10%, 3190 objects; (b) L robustness > 25%, 2714 objects.

Fig. 14. Monte Carlo estimates of the probability that the worst-case nearest ambiguity to 3 × 2, nonnegative, L-shaped objects with L robustness greater than $R\%$ ($R = 10$ and $R = 25$) has a Fourier-modulus error less than $\epsilon$ and an object-domain error greater than $K\epsilon$ ($K = 4$ and $K = 10$).

The worst-case nearest-ambiguity scatter plots for nonnegative, L-shaped images with L robustness greater than 10% and 25% are shown in Fig. 13. As the L-robustness requirement is increased, many points clustered about the $\delta$ axis disappear. (Had we been able to calculate $\delta$ with image shifts taken into account, we would have found these points moving horizontally into the small-$\delta$, small-$\epsilon$ region of the plot.) Despite the nonnegativity of $f$, these scatter plots are less banded than for general 3 × 2 nonnegative objects [case (4) in Fig. 11(b)]. This is verified by the estimated distributions for both taper percentages (Fig. 14). For the case of L robustness greater than 25%, the distributions of Fig. 14 achieve a lower probability than does case (4) for values of $\epsilon$ less than 0.07, reflected by the small number of points near the origin of the plots in Fig. 13. Therefore, for the low-noise case, the L-shaped support constraint not only prevents ambiguity in the absolute sense but it also makes ambiguity less likely in the practical sense.

7.  SUMMARY  AND  CONCLUSIONS 

An ambiguous image is one whose Fourier modulus is identical to the Fourier modulus of a second image that is other than a scaled version, a translation, or a twin of the image. Arbitrary objects are almost never (i.e., with probability zero) ambiguous. Nevertheless, the existence of an ambiguous image close to a given object has two harmful effects: it causes stagnation points for phase-retrieval algorithms and, for the case of noisy Fourier-modulus data, it may cause the solution to be ambiguous in the practical sense. Because of the nonlinearity of the phase-retrieval problem, these issues are difficult to characterize analytically. We investigated the prevalence of ambiguous images for the phase-retrieval problem, using numerical approaches. This is practical because we considered the case of small objects defined on 2 × 2 and 3 × 2 supports.

Using both a new iterative grid-search algorithm and the iterative Fourier-transform algorithm, multiple phase-retrieval experiments were performed, and stagnation points were found that correspond to local minima in the Fourier-domain error metric. These stagnation points were shown to be close to ambiguous images whose Fourier moduli are close to the modulus of the Fourier transform of the object. The implication is that the existence of the ambiguous images causes the local minima to occur. However, the precise relationship between the local minima and the ambiguous images is not yet understood, and nearest ambiguities may not be the sole cause of stagnation.

The prevalence of ambiguities close (with respect to Fourier modulus) to a given object was explored by a Monte Carlo experiment in which nearest ambiguities were found. First, object-domain analytic expressions for the set of ambiguous images were derived for both the 2 × 2 and 3 × 2 supports [Eqs. (37) and (38)]. For the 2 × 2 case, the set of ambiguous images forms a three-dimensional surface embedded in the four-dimensional space of 2 × 2 real-valued images. For the 3 × 2 case, the set of ambiguous images forms a five-dimensional surface embedded in the six-dimensional space of 3 × 2 real-valued images. Next, a reduced-gradient search technique was used to search along the surfaces of ambiguous images to find the ambiguous image nearest a given object with respect to Fourier modulus. Of the nearest-ambiguity pair of images, one is usually close to the object $f$, while its ambiguous counterpart is usually a worse case; it is much farther from the given object, yet it has a Fourier modulus identical to the ambiguous image that is close to $f$. Histograms of Fourier-modulus-domain versus object-domain errors were accumulated in Monte Carlo experiments involving numerous random objects and their worst-case nearest ambiguities. Integration of the histograms, over the points for which the object-domain error is large relative to the Fourier-modulus error, yielded estimates of the probability that a significant ambiguity would occur within a given Fourier-modulus error tolerance. It was found that nonnegativity of the object decreased the probability of significant ambiguity for the 3 × 2 case (as anticipated) but increased the probability of significant ambiguity for the 2 × 2 case. However, since the ambiguous images were allowed to have negative values even when the objects were restricted to be nonnegative, it is likely that the imposition of a nonnegativity constraint in a phase-retrieval algorithm would help to avoid some of those ambiguities. L-shaped images, whose support guarantees uniqueness in the absolute sense, were also investigated. It was found that, for low-noise data, the L-shaped support of the object also makes ambiguity less likely in the practical sense.

Future work should include the application of this approach to objects with larger supports. This is important since it is difficult to extrapolate from these results for 2 × 2 and 3 × 2 supports to the case of most interest: supports with many pixels in each dimension. The probability of significant ambiguity for the 3 × 2 case was of similar magnitude to that of the 2 × 2 case. This is probably because the ambiguity surfaces in both cases were of dimension one less than the dimension of the space of objects. When larger objects are considered, however, this changes. For example, for 3 × 3 objects

$$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$$

factoring into a (3 × 2) convolved with a (1 × 2), the ambiguity condition is given by the simultaneous equations

$$(ah - bg)^2 - (ae - bd)(dh - eg) = 0 \tag{43}$$

and

$$(ah - bg)(af - cd) - (ae - bd)(ai - cg) = 0. \tag{44}$$

These describe two eight-dimensional surfaces embedded in a nine-dimensional space of 3 × 3 real-valued objects, the intersection of which would ordinarily be expected to be a seven-dimensional surface embedded in the nine-dimensional space. The ambiguity condition for the factoring of a 3 × 3 object into a (2 × 2) convolved with another (2 × 2) is also given by a pair of simultaneous equations describing two eight-dimensional surfaces embedded in a nine-dimensional space, the intersection of which would ordinarily be a seven-dimensional surface in the nine-dimensional space. Thus for these larger images the dimensionality of the surface of ambiguous images is smaller relative to the space of all objects than for the 2 × 2 or 3 × 2 case; consequently one would expect the probability of significant ambiguity to be less for these larger images. The importance of the shape of the support constraint (convex versus nonconvex versus separated parts, etc.) may also reveal itself more forcefully for larger supports. Finally, a better understanding of the precise relationship between local minima and nearest ambiguous images could lead to methods for avoiding phase-retrieval algorithm stagnation at local minima.


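APPENDIX A: PROOF THAT ε ≤ δ

By definition, $|\alpha_\delta| = \alpha_\varepsilon$, or $\alpha_\delta = \pm\alpha_\varepsilon$. The proof that
$\varepsilon(g, f) \le \delta(g, f)$ can be given by using Parseval's theorem with the
definition of $\delta(g, f)$:

$$\delta^2(g, f) = \frac{\sum_{x,y} [\alpha_\delta g(x, y) - f(x, y)]^2}{\sum_{x,y} f^2(x, y)} = \frac{\sum_{u,v} |\pm\alpha_\varepsilon G(u, v) - F(u, v)|^2}{\sum_{u,v} |F(u, v)|^2}. \tag{A1}$$

By the triangle inequality, given two vectors, $v_1$ and $v_2$,
$|v_1 - v_2|^2 \ge (|v_1| - |v_2|)^2$. Therefore

$$|\pm\alpha_\varepsilon G(u, v) - F(u, v)|^2 \ge [\alpha_\varepsilon |G(u, v)| - |F(u, v)|]^2. \tag{A2}$$

Inserting inequality (A2) into Eq. (A1), we have

$$\delta^2(g, f) \ge \frac{\sum_{u,v} [\alpha_\varepsilon |G(u, v)| - |F(u, v)|]^2}{\sum_{u,v} |F(u, v)|^2} = \varepsilon^2(g, f). \tag{A3}$$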
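
The inequality can be spot-checked numerically. The sketch below is illustrative and rests on assumptions that are not stated in this appendix: that both error metrics are normalized root-mean-squared errors, and that the scale factor matches the energies of the two images, so that the same magnitude applies in the image and Fourier domains. The function name `image_and_fourier_errors` is hypothetical.

```python
import numpy as np

def image_and_fourier_errors(g, f):
    """Return (delta, eps) for real images g and f.

    Assumption: the scale factor has magnitude sqrt(sum f^2 / sum g^2),
    and its sign in the image domain is the one giving the smaller error.
    """
    alpha_eps = np.sqrt(np.sum(f**2) / np.sum(g**2))
    G, F = np.fft.fft2(g), np.fft.fft2(f)
    # Image-domain error with alpha_delta = +alpha_eps or -alpha_eps.
    num_delta = min(np.sum((s * alpha_eps * g - f)**2) for s in (1.0, -1.0))
    delta = np.sqrt(num_delta / np.sum(f**2))
    # Fourier-modulus error; the phases of G and F are ignored.
    num_eps = np.sum((alpha_eps * np.abs(G) - np.abs(F))**2)
    eps = np.sqrt(num_eps / np.sum(np.abs(F)**2))
    return delta, eps

rng = np.random.default_rng(0)
g = rng.normal(size=(8, 8))
f = rng.normal(size=(8, 8))
delta, eps = image_and_fourier_errors(g, f)
print(eps <= delta)
```

Under these assumptions the Fourier-modulus error never exceeds the image-domain error, since discarding the phase can only bring the two spectra closer together.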

APPENDIX B: DERIVATION OF 3 × 2
AMBIGUITY CONDITION


Equation (33) gives us the following six equations:

$$a = gi, \tag{B1a}$$

$$b = hi + gj, \tag{B1b}$$

$$c = hj, \tag{B1c}$$

$$d = gk, \tag{B1d}$$

$$e = hk + gl, \tag{B1e}$$

$$f = hl. \tag{B1f}$$

Multiplying Eqs. (B1a) and (B1f) gives

$$af = ghil, \tag{B2}$$

and multiplying Eqs. (B1c) and (B1d) gives

$$cd = ghjk. \tag{B3}$$

Combining these yields

$$(af - cd)^2 = g^2h^2(il - jk)^2. \tag{B4}$$

From Eqs. (B1b), (B1c), (B1e), and (B1f) we have

$$(bf - ce) = h^2(il - jk), \tag{B5}$$

and from Eqs. (B1a), (B1b), (B1d), and (B1e) we have

$$(ae - bd) = g^2(il - jk). \tag{B6}$$

Taking the product of Eqs. (B5) and (B6) yields

$$(ae - bd)(bf - ce) = g^2h^2(il - jk)^2. \tag{B7}$$

From Eqs. (B4) and (B7) we arrive at the result

$$(af - cd)^2 - (ae - bd)(bf - ce) = 0. \tag{B8}$$


This equation is the condition that must be met in order for
the 3 × 2 image of Eq. (33) to be ambiguous.

From Eq. (B8) we can solve for any of the six variables in
terms of the other five. For example, by expanding and
collecting powers of b, we arrive at

$$b^2(df) - b(aef + cde) + ace^2 + (af - cd)^2 = 0. \tag{B9}$$

The solution of Eq. (B9), which is quadratic in b, is given by

$$b = \frac{e(af + cd) \pm (e^2 - 4df)^{1/2}(af - cd)}{2df}.$$
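
As a worked check of this derivation, the sketch below builds a 3 × 2 image from Eqs. (B1a) to (B1f) and confirms that it satisfies Eq. (B8) and that one root of Eq. (B9) reproduces b. The factor values and the function names `ambiguity_lhs` and `solve_b` are illustrative choices, not taken from the original work.

```python
import numpy as np

def ambiguity_lhs(a, b, c, d, e, f):
    """Left-hand side of Eq. (B8)."""
    return (a*f - c*d)**2 - (a*e - b*d)*(b*f - c*e)

def solve_b(a, c, d, e, f):
    """Both roots of Eq. (B9) for b; needs d*f != 0 and e**2 >= 4*d*f."""
    disc = e**2 - 4.0*d*f
    if disc < 0.0:
        raise ValueError("e**2 - 4*d*f < 0: no real-valued b exists")
    root = np.sqrt(disc) * (a*f - c*d)
    lead = e * (a*f + c*d)
    return (lead + root) / (2.0*d*f), (lead - root) / (2.0*d*f)

# Arbitrary illustrative factors: (g h) convolved with (i j; k l).
g, h = 1.0, 2.0
i, j, k, l = 3.0, 1.0, 2.0, 5.0
a, b, c = g*i, h*i + g*j, h*j        # Eqs. (B1a)-(B1c)
d, e, f = g*k, h*k + g*l, h*l        # Eqs. (B1d)-(B1f)

print(ambiguity_lhs(a, b, c, d, e, f))   # residual of Eq. (B8)
print(solve_b(a, c, d, e, f), b)         # one root should equal b
```

The other root gives a second value of b that also places the image on the ambiguity surface, which is why a choice of sign is needed whenever this formula is used to return to the surface.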


APPENDIX  C:  GENERALIZED  REDUCED- 
GRADIENT  METHOD 

The generalized reduced-gradient method is a gradient-projection
technique used to apply a set of constraints to a
minimization problem. The application discussed here uses
a single nonlinear homogeneous constraint, $h(\mathbf{x}) = 0$, and the
discussion is presented with this assumption. We begin by
defining the tangent plane to a surface:

Given a point $\mathbf{x}^*$ satisfying $h(\mathbf{x}^*) = 0$, the tangent plane $T$
at that point is $T = \{\mathbf{y}: \nabla h(\mathbf{x}^*) \cdot \mathbf{y} = 0\}$, where $\nabla$ denotes the
gradient with respect to $\mathbf{x}$ and $\cdot$ denotes the dot product.


Simply stated, all vectors $\mathbf{y}$ in the tangent plane $T$ are
perpendicular to the gradient of $h(\mathbf{x})$ at $\mathbf{x}^*$.

In an unconstrained gradient-search method, we would
search for a minimum to the objective function $E(\mathbf{x})$ in the
direction of the negative gradient of that function, $-\nabla E(\mathbf{x})$.
In a constrained search, however, the solution is constrained
to a particular surface within the space, and we must alter
the direction of the search to remain on the surface. We do
this by projecting $-\nabla E(\mathbf{x})$ onto a tangent plane of $h(\mathbf{x})$ and
moving along the plane in the direction of the projection, $\mathbf{p}$.
Because points lying along $\mathbf{p}$ in general will not lie upon the
constraint surface, the goal is to move along $\mathbf{p}$ and then to
return to the surface $h(\mathbf{x}) = 0$ such that there is a sufficient
decrease in the objective function. More will be said below
about how to return to the surface from the projection onto
the tangent plane.

The solution point, $\mathbf{x}_s$, satisfies the following first-order
condition:


All $\mathbf{y}$ satisfying $\nabla h(\mathbf{x}_s) \cdot \mathbf{y} = 0$ (in the tangent plane at $\mathbf{x}_s$)
must also satisfy $-\nabla E(\mathbf{x}_s) \cdot \mathbf{y} = 0$.


The above condition implies that, at the solution, $-\nabla E$ is
parallel to $\nabla h$, which in turn implies that the projection $\mathbf{p}$ is
zero. Note that the above condition applies to any minimum
and not just to the global minimum.

The search is iterative, and we define $\mathbf{x}_k$ as our estimate of
the solution after $k$ iterations. The goal is to find $\mathbf{x}_{k+1}$ such
that $E(\mathbf{x}_k)$ significantly decreases at each iteration and to
continue iterating until the first-order condition above is
satisfied with a sufficient degree of confidence.

We now discuss the reduced-gradient method in more
specific terms for the case of a single homogeneous constraint.
Let us assume we are working in an $L$-dimensional
space. A tangent plane to $h(\mathbf{x})$ can be thought of as a surface
of dimension one less than the space in which it lies. In
order to use projection ideas from linear algebra, we define
the tangent plane as a space spanned by a set of basis vectors.

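A vector that is perpendicular to the tangent plane to $h(\mathbf{x})$
at a point $\mathbf{x} = (x_1\ x_2\ \cdots\ x_L)^t$ is

$$\nabla h(\mathbf{x}) = \left( \frac{\partial h}{\partial x_1}\ \ \frac{\partial h}{\partial x_2}\ \ \cdots\ \ \frac{\partial h}{\partial x_L} \right)^t. \tag{C1}$$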


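A set of $L - 1$ linearly independent $L$-dimensional basis
vectors that span the space perpendicular to $\nabla h(\mathbf{x})$ (i.e., the
tangent plane) is (assuming that $\partial h/\partial x_1 \ne 0$)

$$\mathbf{z}_i = \left[ -\frac{\partial h}{\partial x_{i+1}} \left( \frac{\partial h}{\partial x_1} \right)^{-1}\ \ 0\ \cdots\ 0\ \ 1\ \ 0\ \cdots\ 0 \right]^t, \qquad i = 1, \ldots, L - 1, \tag{C2}$$

with the 1 in position $i + 1$.

The set of basis vectors defined in Eqs. (C2) enables us to
define a projection onto the tangent plane to $h(\mathbf{x})$. If we let
the $\mathbf{z}_i$ be the columns of an $L \times (L - 1)$ matrix,

$$Z = [\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_{L-1}], \tag{C3}$$

then the projection of an arbitrary $L$-dimensional vector, $\mathbf{x}$,
onto the space spanned by the columns of $Z$ is

$$\mathbf{p} = Z(Z^t Z)^{-1} Z^t \mathbf{x}. \tag{C4}$$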

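From Eq. (C4), the projection of $-\nabla E(\mathbf{x})$ onto the tangent
plane to $h(\mathbf{x})$ is just

$$\mathbf{p} = -Z(Z^t Z)^{-1} Z^t \nabla E(\mathbf{x}). \tag{C5}$$

For each estimate $\mathbf{x}_k$ of the solution we have $h(\mathbf{x}_k) = 0$.
The reduced-gradient method calculates $\nabla h(\mathbf{x}_k)$, $Z_k$, and
$-\nabla E(\mathbf{x}_k)$ and uses these with Eq. (C5) to determine the new
search direction:

$$\mathbf{p}_k = -Z_k(Z_k^t Z_k)^{-1} Z_k^t \nabla E(\mathbf{x}_k). \tag{C6}$$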
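
The construction of the search direction can be sketched in a few lines. This is an illustrative sketch only: it assumes that the nonzero partial derivative used to build the basis is the first component of the gradient of h, and the function names `tangent_basis` and `search_direction` are hypothetical.

```python
import numpy as np

def tangent_basis(grad_h):
    """L x (L-1) matrix Z whose columns are perpendicular to grad_h.

    Assumption of this sketch: grad_h[0] != 0, so each column carries
    -grad_h[i+1]/grad_h[0] in its first entry and a single 1 below it.
    """
    L = grad_h.size
    Z = np.zeros((L, L - 1))
    Z[0, :] = -grad_h[1:] / grad_h[0]
    Z[1:, :] = np.eye(L - 1)
    return Z

def search_direction(grad_h, grad_E):
    """p = -Z (Z^t Z)^{-1} Z^t grad_E, following Eqs. (C5) and (C6)."""
    Z = tangent_basis(grad_h)
    # Solve the normal equations instead of forming the inverse explicitly.
    coeffs = np.linalg.solve(Z.T @ Z, Z.T @ grad_E)
    return -Z @ coeffs

grad_h = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
grad_E = np.array([0.5, -1.0, 2.0, 0.0, 1.5, -3.0])
p = search_direction(grad_h, grad_E)
print(grad_h @ p)   # close to zero: p lies in the tangent plane
```

For a single constraint the same vector can be obtained by subtracting from the negative gradient of E its component along the gradient of h, which offers a convenient cross-check on the matrix form.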

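Once $\mathbf{p}_k$ is determined, we must move from $\mathbf{x}_k$ in the direction
of $\mathbf{p}_k$ to find the next estimate $\mathbf{x}_{k+1}$. However, we must
have $h(\mathbf{x}_{k+1}) = 0$, and, in general, it is not possible to find a
step size $\gamma_k \ne 0$ along $\mathbf{p}_k$ such that $h(\mathbf{x}_k + \gamma_k \mathbf{p}_k) = 0$. It
becomes necessary to deviate from $\mathbf{p}_k$ to return to the surface
for our next estimate. This estimate becomes

$$\mathbf{x}_{k+1} = \mathbf{x}_k + \gamma_k \mathbf{p}_k + \mathbf{q}_k, \tag{C7}$$

$$h(\mathbf{x}_{k+1}) = 0, \tag{C8}$$

where $\gamma_k$ and $\mathbf{q}_k$ are chosen such that $E(\mathbf{x}_{k+1}) < E(\mathbf{x}_k)$.
Determining the scalar step size $\gamma_k$ and the direction back to
the surface, $\mathbf{q}_k$, in Eq. (C7) that minimize $E(\mathbf{x}_{k+1})$ can be a
complicated subproblem.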

Rather than spending too much computation time determining
the optimal $\gamma_k$ and $\mathbf{q}_k$, we opt for a simpler approach
to finding an $\mathbf{x}_{k+1}$ that produces a sufficient decrease in the
objective function. We do this by (1) selecting a value for
$\gamma_k$, then (2) using $\mathbf{x}_k + \gamma_k \mathbf{p}_k$ for all but one of the components of
$\mathbf{x}_{k+1}$, and then (3) using Eq. (C8) to determine the last
component. Equation (35) is an example of Eq. (C8) for solving
for the component $b$. The objective function is evaluated to
determine whether there is a sufficient decrease. If we are
not satisfied with the new estimate, we choose another value
of $\gamma_k$ and repeat the procedure. Using this procedure, we
can think of the objective function as a function of $\gamma$ and can
set $\gamma_k$ to the value of $\gamma$ that minimizes $E(\gamma)$. One could use
any of a number of standard line search techniques to estimate
$\gamma_k$, but we used a slightly different method to estimate
this minimum and to find $\mathbf{x}_{k+1}$.

Iterative  Quadratic  Fit 

The technique implemented to minimize $E(\gamma)$ with respect
to $\gamma$ can best be described as an iterative quadratic fit (IQF).
It uses quadratic curve fitting to approximate the minimum
of $E(\gamma)$ iteratively and thus determine $\gamma_k$. The description
of the IQF below assumes the ability to fit a quadratic
polynomial to three points:

(1) Initialize: $\gamma_1 = \gamma_{m2} = 0$, $\gamma_2$, $\gamma_3$.

(2) Calculate $E(\gamma_1)$, $E(\gamma_2)$, and $E(\gamma_3)$.

(3) Calculate $\gamma_{m1}$, the value of $\gamma$ that corresponds to the
minimum of the quadratic polynomial in $\gamma$ fit to the points
$[\gamma_1, E(\gamma_1)]$, $[\gamma_2, E(\gamma_2)]$, $[\gamma_3, E(\gamma_3)]$.

(4) Calculate $E(\gamma_{m1})$.

(5) If $|\gamma_{m1} - \gamma_{m2}| < d$, then $\gamma_k \leftarrow \gamma_{m1}$ and stop; otherwise
continue with step (6).

(6) Of $\gamma_1$, $\gamma_2$, and $\gamma_3$, find the two that are closest to $\gamma_{m1}$.
Call these $\gamma_{c1}$ and $\gamma_{c2}$.

(7) Set: $\gamma_1 \leftarrow \gamma_{c1}$, $E(\gamma_1) \leftarrow E(\gamma_{c1})$,

$\gamma_2 \leftarrow \gamma_{c2}$, $E(\gamma_2) \leftarrow E(\gamma_{c2})$,

$\gamma_3 \leftarrow \gamma_{m1}$, $E(\gamma_3) \leftarrow E(\gamma_{m1})$,

$\gamma_{m2} \leftarrow \gamma_{m1}$.

(8) Go to step (3).

The initial values of $\gamma_2$ and $\gamma_3$ should be chosen based on
experimentation and observation of typical $E(\gamma)$ versus $\gamma$
curves. These values are not crucial to the success of the
quadratic fit but should be spaced well enough to give a
reasonable initial fit. The value of the termination parameter
$d$ should be based on the degree of accuracy needed and
should be chosen large enough to avoid excessive iterations.
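
Steps (1) through (8) can be sketched as follows. This is a minimal illustration rather than the original implementation: the vertex formula is the standard one for a parabola through three points, and the iteration cap `max_iter`, the default value of `d`, and the function names are assumptions added here.

```python
def quadratic_vertex(g1, e1, g2, e2, g3, e3):
    """Stationary point of the parabola through three (gamma, E) points.

    It is a minimum only when the fitted parabola opens upward.
    """
    num = (g2 - g1)**2 * (e2 - e3) - (g2 - g3)**2 * (e2 - e1)
    den = (g2 - g1) * (e2 - e3) - (g2 - g3) * (e2 - e1)
    if den == 0.0:
        raise ValueError("points are collinear or repeated; no unique vertex")
    return g2 - 0.5 * num / den


def iterative_quadratic_fit(E, gamma2, gamma3, d=1e-6, max_iter=50):
    """Return (gamma, E(gamma)) found by the iterative quadratic fit."""
    g = [0.0, gamma2, gamma3]            # step (1): gamma_1 = gamma_m2 = 0
    e = [E(x) for x in g]                # step (2)
    g_m2 = 0.0
    for _ in range(max_iter):            # cap is an added safeguard
        g_m1 = quadratic_vertex(g[0], e[0], g[1], e[1], g[2], e[2])  # step (3)
        e_m1 = E(g_m1)                   # step (4)
        if abs(g_m1 - g_m2) < d:         # step (5)
            return g_m1, e_m1
        # steps (6) and (7): keep the two old points nearest the new minimum
        keep = sorted(range(3), key=lambda n: abs(g[n] - g_m1))[:2]
        g = [g[keep[0]], g[keep[1]], g_m1]
        e = [e[keep[0]], e[keep[1]], e_m1]
        g_m2 = g_m1                      # then back to step (3)
    return g_m1, e_m1


gamma, value = iterative_quadratic_fit(lambda t: (t - 0.3)**2 + 1.0, 0.5, 1.0)
print(gamma, value)
```

For an exactly quadratic E the first fit already lands on the minimum and the second pass only confirms it; for other smooth curves more passes are needed, and the fit may fail when the three points become nearly collinear.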

The success of the IQF depends largely on the shape of
$E(\gamma)$. If $E(\gamma)$ is not fairly smooth, the IQF may not find the
actual minimum; this is not a problem if a sufficient decrease
in $E$ is achieved. A more difficult problem occurs when the
projection onto the tangent plane extends into a region of
the 6-D space for which the equation for a return to the
surface is not defined. As an example, consider using Eq.
(35) to return to the surface by calculating $b$ given the other
five variables. If a range of values of $\gamma$ exists for which
$\mathbf{x}_k + \gamma \mathbf{p}_k$ extends into the region where $e^2 - 4df < 0$, then $b$ (which is
by definition real valued) and hence $E(\gamma)$ will not be defined
over this range. When we encountered a case such as this,
we implemented a Fibonacci line search to estimate the
minimum of $E(\gamma)$ on the interval of $\gamma$ for which $E(\gamma)$ is defined.
It should be stressed that these potential problems arise out
of the method used here to return to the ambiguity surface,
and other methods exist that may circumvent this but that
are more computationally burdensome.

SPECIFICS TO THE NEAREST-AMBIGUITY SEARCH
Since we have discussed the constrained-minimization technique
in somewhat general terms to this point, let us now
mention some details and summarize the procedure.

The gradient of $E(\mathbf{x})$ of Eq. (36) can be computed by using
the following relationship:

$$\frac{\partial E}{\partial g(x, y)} = 2MN[g(x, y) - g'(x, y)], \tag{C9}$$

where

$$\mathrm{DFT}[g'(x, y)] = |F(u, v)| \frac{G(u, v)}{|G(u, v)|}. \tag{C10}$$

Since the ordering of the pixels of $g(x, y)$ in the vector $\mathbf{x}$ is
defined, Eq. (C9) can be used to calculate the components of
$\nabla E(\mathbf{x})$ using two DFT's (since $|F(u, v)|$ is given).
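
The two-DFT gradient computation can be sketched as below. Eq. (36) is not reproduced in this appendix, so the sketch assumes that the objective is the unnormalized sum over Fourier samples of the squared difference between the moduli, with an unnormalized forward DFT; under that assumption the factor 2MN of Eq. (C9) follows, and a different normalization would change only that factor. The function name `error_and_gradient` is hypothetical.

```python
import numpy as np

def error_and_gradient(g, F_mod):
    """Return E and dE/dg for E = sum (|G| - |F|)^2 with G = DFT[g].

    Assumptions: g is a real M x N array and F_mod is the given Fourier
    modulus |F(u, v)| on the same grid.
    """
    G = np.fft.fft2(g)                              # first DFT
    G_mod = np.abs(G)
    E = np.sum((G_mod - F_mod)**2)
    # Unit-modulus phase factor G/|G|, guarded where |G| = 0.
    safe = np.where(G_mod > 0, G_mod, 1.0)
    phase = np.where(G_mod > 0, G / safe, 1.0)
    g_prime = np.fft.ifft2(F_mod * phase).real      # second DFT, Eq. (C10)
    grad = 2.0 * g.size * (g - g_prime)             # Eq. (C9); g.size = M*N
    return E, grad

rng = np.random.default_rng(0)
g = rng.normal(size=(4, 4))
F_mod = np.abs(np.fft.fft2(rng.normal(size=(4, 4))))
E0, grad = error_and_gradient(g, F_mod)
print(E0, grad.shape)
```

A finite-difference comparison on a few pixels is a simple way to confirm that the normalization assumed for E matches the factor used in the gradient.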

The various steps of the reduced-gradient constrained-minimization
algorithm are as follows:

1. Initialization

(a) Determine $|F(u, v)|$.

(b) Make an initial guess, $\mathbf{x}_0$, such that $h(\mathbf{x}_0) = 0$.

(c) Compute $E(\mathbf{x}_0)$.

(d) $k \leftarrow 0$.

2. Calculating the search direction, $\mathbf{p}_k$

(a) Compute $\nabla h(\mathbf{x}_k)$.

(b) Form $Z_k$, $Z_k^t$.

(c) Compute $\nabla E(\mathbf{x}_k)$.

(d) Compute $\mathbf{p}_k$ from Eq. (C6).

3. Iterative Quadratic Fit to find $\mathbf{x}_{k+1}$ from $\mathbf{x}_k$ and $\mathbf{p}_k$

4. If $[E(\mathbf{x}_k) - E(\mathbf{x}_{k+1})]/E(\mathbf{x}_k) < a$,

then: Done; estimate of minimum is $\mathbf{x}_{k+1}$.
else: (a) $k \leftarrow k + 1$.

(b) Go to step 2.

The termination condition in step 4 above is based on a
percentage change between successive iterations. The
bound $a$ is selected to reflect the precision of the estimate of
the minimum. While it may be tempting to use the condition
that $-\nabla E$ is perpendicular to the tangent plane, that is,

$$-\nabla E(\mathbf{x}_{k+1}) \cdot \mathbf{p}_{k+1} < \varepsilon \tag{C11}$$

for some small $\varepsilon$, it is difficult to pick the value of $\varepsilon$ that
will consistently give us the same confidence in the precision
of our estimate without choosing it so small that it causes
needless iterations in many cases.
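
The overall loop can be illustrated on a toy problem. The sketch below is not the nearest-ambiguity search itself: the constraint surface, the objective, and the target point are invented for illustration, the projection uses the closed form that is equivalent to Eq. (C5) for a single constraint, and a simple step-halving search stands in for the iterative quadratic fit.

```python
import numpy as np

TARGET = np.array([2.0, 1.0, 1.0])       # toy data, lies on the surface

def h(x):                                 # toy constraint, not the paper's
    return x[0] - x[1]**2 - x[2]**2

def grad_h(x):
    return np.array([1.0, -2.0 * x[1], -2.0 * x[2]])

def E(x):                                 # toy objective
    return float(np.sum((x - TARGET)**2))

def grad_E(x):
    return 2.0 * (x - TARGET)

def project(gh, v):
    """Component of v perpendicular to gh (tangent-plane projection)."""
    return v - gh * (gh @ v) / (gh @ gh)

def return_to_surface(x):
    """Keep all but one component and solve h = 0 for the remaining one."""
    y = np.array(x, dtype=float)
    y[0] = y[1]**2 + y[2]**2
    return y

def reduced_gradient(x0, a=1e-10, max_iter=500):
    x = return_to_surface(x0)                           # step 1
    for _ in range(max_iter):
        p = project(grad_h(x), -grad_E(x))              # step 2
        gamma, x_new = 1.0, x
        while gamma > 1e-12:                            # step 3 (stand-in)
            trial = return_to_surface(x + gamma * p)
            if E(trial) < E(x):
                x_new = trial
                break
            gamma *= 0.5
        if E(x) - E(x_new) <= a * E(x):                 # step 4
            return x_new
        x = x_new
    return x

solution = reduced_gradient([0.0, 0.0, 0.0])
print(solution, E(solution), h(solution))
```

Every accepted step satisfies the constraint by construction, because the last component is always recomputed from the others, which mirrors the use of Eq. (C8) described above.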